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Preface 


THE AIM and contents of the book are briefly described in the Introduction. 
My debt to those who have written on the same subject before me is recorded 
in the footnotes and in the list of references on pp. 351-379. Here I wish to 
acknowledge my obligation and to convey my warmest thanks to those who 
knowingly and willingly have aided me in my work. 

Mario Bunge encouraged me to write the book for this series. I also owe him 
much of my present understanding of fundamental physics as natural 
philosophy, in the spirit of Aristotle, Newton and Einstein, and in opposition 
to my earlier instrumentalist leanings. 

Carla Cordua stood by me during the long and not always easy time of 
writing, and gracefully put up with my absentmindedness. She heard much of 
the book while still half-baked, helping me with her fine philosophical sense 
and her keen awareness of style to shape my thoughts and to find the words to 
express them. 

John Stachel read the first draft of Chapter 5 and contributed with his vast 
knowledge and his incisive intelligence to dispel some of my confusions and to 
correct my errors. He wrote the paragraphs at the end of Section 5.8 (pp. 181 
ff.) on the relation between gravitational radiation and the equations of 
motion in General Relativity, an important subject of current research I did 
not feel competent to deal with. 

Adolf Grinbaum, whose “Geometry, Chronometry and Empiricism” 
awakened my interest in the philosophical problems of relativity and 
geometry, has kindly authorized me to quote at length from his writings. 

David Malament corrected a serious error in my treatment of causal spaces. 

Roger Angel, Mario Bunge, Alberto Coffa, Clark Glymour, Adolf 
Grunbaum, Peter Havas, Bernulf Kanitscheider, David Malament, Lewis 
Pyenson, John Stachel and Elie Zahar have sent me copies of some of their 
latest writings. John Stachel, Kenneth Schaffner and my dear teacher Felix 
Schwartzmann have called my attention to some papers I would otherwise 
have overlooked. My son Christian, to whom the book is dedicated, has often 
verified references and procured me materials to which I did not have direct 
access. 

The Estate of Albert Einstein has kindly authorized me to reproduce the 
quotations from Einstein’s correspondence. Part of them I obtained in the 
Manuscript Room of the Firestone Library in Princeton University with the 
expert and polite assistance of its staff. During my short stay in Princeton I 
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enjoyed the pleasant hospitality of the Institute for Advanced Study. 

I began to write in August 1979 and finished today. In the second half of 
1980 the University of Puerto Rico freed me from my teaching and 
administrative duties and the John Simon Guggenheim Memorial 
Foundation granted me a fellowship to pursue my work. I am deeply grateful 
to both institutions for their renewed trust and their generous support. 


Rio PrepRAS, August 15th, 1981. 


Naturwissenschaft ist der Versuch die Natur durch genaue Begriffe 
aufzufassen. 
BERNHARD RIEMANN 


Nach unserer bisherigen Erfahrung sind wir némlich zu dem Vertrauen 
berechtigt, dass die Natur die Realisierung des mathematisch denkbar 
Einfachsten ist. Durch rein mathematische Konstruktion vermégen 
wir nach meiner Ueberzeugung diejenigen Begriffe und diejenige 
gesetzliche Verkniipfung zwischen ihnen zu finden, welche den 
Schliissel fiir das Verstehen der Naturerscheinungen liefern. 

ALBERT EINSTEIN.* 


Introduction 


1 


GEOMETRY grew in ancient Greece as the science of plane and solid figures. 
Purists would have had its scope restricted to figures that can be constructed 
with ruler and compass, but its subject-matter somehow led by itself to the 
study of curves such as the quadratrix (Hippias, 5th century B.C.), the spiral 
(Archimedes, 287-212 B.C.), the conchoid (Nicomedes, ca. 200 B.C.), the 
cissoid (Diocles, ca. 100 B.C.), which do not fit into that description. The 
construction of a figure brings out points, which one then naturally regards as 
pre-existing —at least “potentially”—in a continuous medium that provides, so 
to speak, the material from which the figures are made. Understandably, 
geometry came to be conceived of as the study of this medium, the science of 
points and their relations in space. This idea is in a sense already implicit in the 
Greek problem of loci—i.e. the search for point-sets satisfying some pre- 
scribed condition. It is fully and openly at work in Descartes’ Géométrie (1637). 
When space itself, that is the repository of all points required for carrying out 
the admissible constructions of geometry, was identified with matter 
(Descartes), or with the locus of the divine presence in which matter is placed 
(Henry More), the stage was set for the foundation of natural philosophy upon 
geometrical principles (Newton). Inspired to a considerable extent by 
Newton’s success, Kant developed his view of geometry as a paradigm of our 
a priori knowledge of nature, the surest proof that the order of nature is an 
outgrowth of human reason. 

For all its alluring features, Kant’s philosophy of geometry could not in the 
long run stand its ground before the onrush of the many geometries that 
flourished in the 19th century. Bold yet entirely plausible variations of the 
established principles and methods of geometrical thinking gave rise to a 
wondrous crop of systems, that furnished either generalizations and exten- 
sions of, or alternatives to, the geometry of Euclid and Descartes. The almost 
indecent fertility of reason definitely disqualified it from prescribing the 
geometrical structure of nature. The choice of a physical geometry, among the 
many possibilities offered by mathematics, was either a matter of fact, to be 
resolved by experimental reasoning (Gauss, Lobachevsky), or a mere matter of 
agreement (Poincaré). 

Two widely different proposals were made, in the third quarter of the 
century, for ordering and unifying the new disciplines of geometry; namely, 
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Riemann’s theory of manifolds, and Klein’s Erlangen programme.’ Riemann 
sought to classify physical space within the vast genus of structured sets or 
“manifolds” of which it is an instance. While countenancing the possibility that 
it might ultimately be discrete, he judged that, at any rate macroscopically, it 
could be regarded as a three-dimensional continuum (ie., in current parlance, 
as a topological space patchwise homeomorphic to R*). Lines, i.e. one- 
dimensional continua embedded in physical space, had a definite length 
independent of the manner of their embedding. The practical success of 
Euclidean geometry suggested, moreover, that the infinitesimal element of 
length was equal, at each point of space, to the positive square root of a 
quadratic form in the coordinate differentials, whose coefficients, however, 
would in all likelihood—for every choice of a coordinate system—vary 
continuously with time and place, in utter contrast with the Euclidean case. 
Riemann’s theory of metric manifolds was perfected by Christoffel, Schur, 
Ricci-Curbastro, etc. and would probably have found an application to 
classical mechanics if Heinrich Hertz had been granted the time to further 
develop his ideas on the subject. But it was not taken seriously as the right 
approach to physical geometry until Albert Einstein based on it his theory of 
gravitation or, as he preferred to call it, the General Theory of Relativity. 
Klein defined a geometry within the purview of his scheme by the following 
task: Let there be given a set and a group acting on it; to investigate the 
properties and relations on the set which are not altered by the group 
transformations.” A geometry is thus characterized as the theory of invariants 
of a definite group of transformations of a given abstract set.? The known 
geometries can then be examined with a view to ascertaining the transform- 
ation group under which the properties studied by it are invariant. Since the 
diverse groups are partially ordered by the relation of inclusion, the geometries 
associated with each of them form a hierarchical system. Klein’s elegant and 
simple approach won prompt acceptance. It encouraged some of the deepest 
and most fruitful mathematical research of the late 19th century (Lie on Lie 
groups; Poincaré on algebraic topology). It stimulated the advent of the 
conventionalist philosophy of geometry.* Its most surprising and perhaps in 
the long run historically most decisive application was the discovery by 
Hermann Minkowski that the modified theory of space and time on which 
Albert Einstein (1905d) had founded his relativistic electrodynamics of moving 
bodies was none other than the theory of invariants of a definite group of 
linear transformations of R*, namely, the Lorentz group.° It was thus shown 
that the new non-Newtonian physics—known to us as the Special Theory of 
Relativity—that was being then systematically developed by Einstein, Planck, 
Minkowski, Lewis and Tolman, Born, Laue, etc., rested on a new, non- 
Euclidean, geometry, which incorporated time and space into a unified 
“chronogeometric” structure. Minkowski’s discovery not only furnished a 
most valuable standpoint for the understanding and fruitful development of 
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Special Relativity, but was also the key to its subsequent reinterpretation as the 
local—tangential—-approximation to General Relativity. The latter, of course, 
was conceived from the very beginning as a dynamical theory of spacetime 
geometry-—of a geometry, indeed, of which the Minkowski geometry is a 
unique and rather implausible special case. 

The main purpose of this book is to elucidate the motivation and 
significance of the changes in physical geometry brought about by Einstein, in 
both the first and the second phase of Relativity. However, since the geometry 
is, in either phase, inseparable from the physics, what the book in fact has to 
offer is a “historico-critical” exposition of the elements of the Special and the 
General Theory of Relativity. But the emphasis throughout the book is on 
geometrical ideas, and, although I submit that the standpoint of geometry is 
particularly congenial to the subject, I also believe that there is plenty of room 
left for other comparable studies that would regard it from a different, more 
typically “physical” perspective.® 


The book consists of seven chapters and a mathematical appendix. The first 
two chapters review some of the historical background to Relativity. Chapter 1 
deals with the principles of Newtonian mechanics—space and time, force and 
mass—as they apply, in particular, to the foundation of Newtonian kine- 
matics. Chapter 2 summarily sketches some of the relevant theories and results 
of pre-Relativity optics and electrodynamics.’ The next two chapters refer to 
Special Relativity. Chapter 3 is mainly devoted to Einstein’s first Relativity 
paper of 1905. It carefully analyses Einstein’s definition of time on an inertial 
frame—which, as I will show, can be seen as a needful improvement of the 
Neumann-Lange definition of an inertial time scale—and his first derivation 
of the Lorentz transformation. It also comments on some implications of the 
Lorentz transformation, such as the relativistic effects usually described as 
“length contraction” and “time dilation”; on the alternative approach to the 
Lorentz transformation initiated by W. von Ignatowsky (1910), and on the 
relationship between Einstein’s Special Theory of Relativity and the con- 
temporary, predictively equivalent theory of Lorentz (1904) and Poincaré 
(1905, 1906). Chapter 4 presents the Minkowskian formulation of Special 
Relativity with a view to preparing the reader for the subsequent study of 
General Relativity. Thus, Sections 4.2 and 4.3 (together with Sections A and B 
of the Appendix) explain the concept of a Riemannian n-manifold and of a 
geometric object on an n-manifold, and Section 4.5 introduces von Laue’s 
stress-energy-momentum tensor. Section 4.6 sketches the theory of causal 
spaces originating with A. A. Robb (1914) and perfected in our generation by 
E. Kronheimer and R. Penrose (1967). Chapter 5 deals with Einstein’s search 
for General Relativity from 1907 to 1915. As Iam not altogether satisfied with 
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earlier descriptions of this development,’ I pay here special attention to 
historical detail and quote extensively from the primary sources. For 
completeness’ sake I have added, in Section 5.8, a brief qualitative apercu of 
Einstein’s later work on the equations of motion of General Relativity. 
Chapter 6 studies some aspects and subsequent developments of the theory 
which I believe are especially significant from a philosophical point of view, 
such as the Weyl-Ehlers—Pirani-Schild conception of a geometry of hight 
propagation and free fall, the so-called Mach Principle, the standard model of 
relativistic cosmology and the idea of a spacetime singularity. Chapter 7 is 
more explicitly philosophical and deals with the concept of simultaneity, 
geometric conventionalism, and a few other questions concerning spacetime 
structure, causality and time. The Appendix was designed to furnish a 
definition of spacetime curvature within the theory of connections in the 
principal bundle of tetrads. On our way to this definition every concept of 
differential geometry that is used, but not explained, in the main text is defined 
as well. 

Appended to the main text is a sizable number of notes. They are printed at 
the end of the book, because I did not want the reader to be continually 
distracted by their presence at the foot of the page. He should disregard them 
until he finishes reading the respective section of the text. Besides the usual 
references to sources, the notes contain: (i) definitions of mathematical terms 
and other supplementary explanations that may be of assistance to some 
readers; (ii) additional remarks, quotations, proofs and empirical data, that 
support, illustrate or refine the statements in the main text, but may be omitted 
without loss of continuity; (ii) suggestions for further reading. Notes of type (i) 
are usually separate from those of type (ii) and (11). am sure that each reader 
will soon learn to decide at a glance which notes deserve his or her attention. 

Equations and formulae are designated by three numbers, separated by 
points. The first two numbers denote the chapter and section in which the 
equation or formula occurs. Those contained in the Appendix are designated 
by a capital A, B, C or D, followed by a number—the letter denoting the 
pertinent part of the Appendix. 

The list of References on pages 351 ff. provides bibliographical information 
on every book and article mentioned in the main test, the Appendix or the 
notes. It also contains a few items pertaining to our subject which I have found 
useful, but which I have not had the opportunity of quoting in the book. 
Normally references are identified by the author’s name followed by the 
publication year. When I refer to two versions of the same work published in 
the same year, the second version is identified by an asterisk after the year (e.g. 
“Lorentz (1899*)”). When several works by the same author appeared in the 
same year they are distinguished by small italic letters, which do not have any 
further significance. Please be advised that in such cases references in the text 
and notes often omit the letter a (but not the letters b, c, etc.). Thus, “Weyl 
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(1919)” may stand for “Weyl (1919a)”. In a few instances in which I do not 
quote a work after the first edition, I identify it by some suitable abbreviation, 
instead of the publication year (e.g. “Newton (PNPM), instead of “Newton 
(1972)”, which would be incongruous if not misleading). 


A book like this one would be impossibly long if it tried to explain every 
notion employed in it. The mathematical concepts more directly relevant to an 
understanding of Relativity and Geometry belong to differential geometry and 
will be defined either in the main text (chiefly in Sections 4.2, 4.3, 5.4, 5.5, 6.1 
and 6.4) or in the Appendix.® Their definition, however, will be in terms of 
other mathematical concepts. Specifically, I assume that the reader has: 


(i) a clear recollection of the fundamental concepts of the calculus (on 
R"),!° even if he is no longer able to work with them; 

(ii) a firm grasp of the elements of linear algebra, up to and including the 
notion of a multilinear mapping of a finite-dimensional module or 
vector space into another.!! 


In section 5.7 I have tried to reproduce Einstein’s reasoning in the three papers 
of November 1915 in which he proposed three alternative systems of 
gravitational field equations—culminating with the one that is now named 
after him. The calculations can be easily followed, using the hints in the 
footnotes, by someone who has a working knowledge of the elements of the 
tensor calculus.'? Since the said Section 5.7 will probably be omitted by 
readers who have not benefited from a certain measure of formal mathematical 
training, I took the liberty to use in it some basic ideas and results of the 
calculus of variations which are not explained—and not otherwise 
mentioned—in this book. 

It is a good deal harder to say precisely what knowledge of physics is 
presupposed. Presumably no one will turn to Relativity who is not somehow 
acquainted with classical mechanics and electrodynamics; and Chapters 1 and 
2 are meant indeed for readers who have already made this acquaintance. In 
particular, the critical analysis of Newtonian space, time and inertia in Chapter 
1 could be misleading for someone who has yet to learn about them in the 
ordinary way. All the requisite ideas of Relativity are duly explained; but the 
treatment of some of the better-known consequences of the Special Theory— 
particularly in Section 3.5—is perhaps too laconic. However, it is doubtful that 
anybody would try to read the present book without studying first one of the 
Relativity primers in circulation, such as Einstein’s own “popular exposition”, 
or Bondi’s Relativity and Common Sense, or, better still, Taylor and Wheeler’s 
Spacetime Physics, where those consequences are amply discussed.’ On the 
other hand, I do not assume any previous knowledge of General Relativity, but 
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rather seek to give in Chapters 4 and 5 a historical introduction to this theory. 
However, due to the book’s peculiar bias and its almost total neglect of 
experimental data, it is highly advisable that beginners supplement it with 
some other introductory book written from a physicist’s point of view. At the 
time of writing my own favourite for this job was Wolfgang Rindler’s Essential 
Relativity (second, revised edition, 1977), which also furnishes an introduction 
to the Special Theory.!+ 


Throughout the book I have tried to use only standard terminology and 
notation. However, there are several cases in which the standards are unstable. 
For the benefit of knowledgeable readers who may wish to consult something 
in the middle of the book without having to track down the definition of terms 
they are already familiar with, I shall here draw up a list of such cases, stating 
my choice among the available alternatives. Readers unacquainted with the 
ideas involved should omit the rest of this §4 and wait until the relevant 
definitions and conventions are in due course formally introduced. 

The fibre of a mapping fover a given value is the set of objects in the domain 
of f at each of which f takes that value. (The fibre of f over a is {x| f(x) = a}). 

A curve ina manifold M is always understood to be a parametrized curve, i.e. 
a continuous—usually differentiable—mapping of a real interval into M. The 
curve’s range, i.e. the subset of M onto which the said interval is mapped by it, 
is called a path. 

A basis of the tangent space at a point of an n-dimensional differentiable 
manifold is called an n-ad (rather than a frame or repére). A section of the 
bundle of frames over such a manifold is an n-ad field (instead of a frame field or 
a repere mobile). If n = 4, we speak of tetrads and tetrad fields. 

Beginning on page 23, the Latin indices i, j,k... range over {0, 1, 2,3}; the 
Greek indices a, B,y... range over {1, 2,3}. 

The Einstein summation convention is introduced on page 89 and used 
thereafter unless otherwise noted. 

We speak of Riemannian (definite or indefinite) and proper Riemannian 
(positive definite) metrics—meaning what are often called Semi-Riemannian 
and Riemannian metrics, respectively. 

We speak of the Lorentz group (including translations) and the homogeneous 
Lorentz group—meaning what are nowadays more frequently called the 
Poincaré group and the Lorentz group, respectively. 

The flat Riemannian metric of Minkowski spacetime is denoted by 4. The 
matrix of its components relative to an inertial coordinate system with 
standard time coordinate is [nj] = diag (1, — 1, — 1, — 1). The signature of all 
relativistic metrics is therefore (+ — — —). 

The Lorentz factor 1/./1—v?/c? is designated by , or, where there is no 
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danger of confusion, simply by £, as in Einstein (1905d). (The popular symbol y 
does not have the same distinguished pedigree as B. Moreover, y is here pre- 
empted to serve as the standard name of a typical curve.) 

The prefix world- indicates appurtenance to a spacetime. A worldline is any 
curve in spacetime—not necessarily a timelike one. 

For brevity’s sake, objects are said to move or rest ina frame of reference— 
meaning relatively to it. 

Beginning on page 89 and unless otherwise noted, the vacuum speed of 
light c is set equal to 1. 

Let me add finally that in this book the classical gravitational potential 
increases as one recedes from the source (the Newtonian gravitational force is 
therefore equal to minus the gradient of the potential); that relativistic 4- 
momentum is defined as a vector (not a covector as in current literature); and 
that the Ricci tensor is formed by contracting the Riemann tensor with respect 
to the first and /ast (not the third) indices. 

Many of these conventions do not express the author’s persona! preference, 
but arise rather from the necessity of conforming as far as possible with the 
usage of Einstein and his contemporaries. 


A final word to the reader. When, ina book of this sort, the author qualifies a 
statement with such adverbs as “evidently”, “obviously”, “clearly”, what he 
thereby intends to say is that in order to see that the statement in question is 
true a small effort of attention and reflection will usually be sufficient. The 
qualifying adverb is meant to encourage the reader to make this small effort, 
not to humiliate him because he cannot do without it. Genuinely self-evident 


statements need not bear a tag to remind one that they are such. 


CHAPTER 1 


Newtonian Principles 


1.1. The Task of Natural Philosophy 


In the Preface to the Principia, Newton says that the whole task of 
philosophy appears to consist in this: To find out the forces of nature by 
studying the phenomena of motion, and then, from those forces, to infer the 
remaining phenomena. This—he adds—is illustrated in the third Book of his 
work, where, from celestial phenomena, through propositions mathematically 
demonstrated in the former Books, he derives the forces of gravity by which 
bodies tend to the sun and to the several planets, and then, from these forces, 
with the aid of further mathematical propositions, he deduces the motions of 
the planets, the comets, the moon, and the sea. He expects that all other 
phenomena of nature might be derived by the same kind of reasoning, for he 
surmises, on many grounds, that they all depend upon certain forces by which 
the particles of bodies, by causes still unknown, are either mutually impelled to 
one another, and cohere in regular figures, or are repelled and recede from one 
another. ! 

Newton’s programme turns on the concepts of motion and force. Unlike 
medieval writers, Newton does not understand by motion—motus—every kind 
of quantitative and qualititative change, but only that which was formerly 
called motus localis, that is, locomotion or change of place.? Force—vis—is 
“the causal principle of motion and rest”.? In the early manuscript from which 
I quote these definitions, Newton neatly distinguishes two kinds of force. It is 
“either an external principle which, when impressed in a body, generates or 
destroys or otherwise changes its motion, or an internal principle, by which the 
motion or rest imparted to a body are conserved and by which any being 
endeavours to persevere in its state and resists hindrance”.* These two kinds of 
force are, respectively, the subject of Definitions IV and III of the Principia, 
where the latter is referred to as “the innate force (vis insita) of matter”, and the 
former as “impressed force (vis impressa)”.* Innate force is said to be 
proportional to the “quantity” of matter—also called by Newton its “body” 
(corpus) or “mass” (massa)}—and it is said to differ from the natural “laziness” 
(inertia) of matter in our conception only. The innate force must therefore be 
regarded as an indestructible property of the matter endowed with it. 
Impressed force, on the other hand, is an action only, and does not remain in 
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the body after the action is over. “For a body perseveres in every new state by 
its force of inertia alone.”® 

In order to play the fundamental role assigned to them in Newton’s 
mathematical physics, the concepts of motion and force must be made 
quantitative. The quantity of motion is defined as the product of the velocity 
and the quantity of matter or mass of the moving body, which, as we saw, is 
equated with the innate force of inertia. A quantitative relation between the 
latter and the impressed force can therefore be derived from the Second Law of 
Motion, according to which the quantity of motion changes with time at a rate 
proportional to the motive force impressed.’ It follows that the ratio between 
the impressed and the innate force is proportional to the time rate of change of 
velocity. The magnitude of the innate force can in turn be evaluated by 
comparison with a conventional standard of unit mass, in accordance with a 
remark that follows the Third Law of Motion. The Third Law says that 
to every action or impressed force there is always opposed an equal reaction, so 
that the mutual actions of two bodies upon each other are always equal, and 
directed to contrary parts. As a consequence of this Law, “if a body impinging 
upon another body changes by its force in any way the motion of the latter, 
that body will undergo in its turn the same change, towards the contrary part, 
in its own motion, by the force of the other. [ ... ] These actions produce equal 
changes not of the velocities, but of the motions. [... ] Hence, because the 
motions are equally changed, the changes in the velocities made towards 
contrary parts are inversely proportional to the bodies” or masses, and 
consequently also to the innate forces.® In this way, the Axioms, or Laws of 
Motion—also called “hypotheses” in Newton’s unpublished manuscripts?— 
enable us to trace and even to measure the forces of nature in the light of the 
observable features of motions, that is, in the first place, of their velocity, and, 
in the second place, of the time rate at which their velocity is changing in 
magnitude and direction. 


1.2. Absolute Space 


In ordinary conversation we indicate the place of a thing by specifying its 
surroundings. It would seem that Aristotle was trying to make precise the 
conception that underlies this practice when he wrote that “place (ho topos) is 
the innermost motionless boundary of the surroundings (tou periekhontos)”.' 
Descartes criticized Aristotle’s definition because there is often no such 
“innermost motionless boundary” and yet things are not said to lack a place, as 
one can see by considering a boat that lies at anchor ina river on a windy day.” 
Nevertheless he apparently abides by the same ordinary conception, for he 
characterizes motion or change of place “in the true sense”, as “the transport of 
a part of matter, or a body, from the neighbourhood of those that touch it 
immediately, and that we regard as being at rest, to the neighbourhood of some 
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others”.? For Newton, however, “the positions, distances and local motions of 
bodies are to be referred to the parts of space”.* Space, in turn, must be 
conceived absolutely—that is, free from all ties to any bodies placed in it—as a 
boundless real immaterial medium in which every construction postulated or 
demonstrated by Euclid can be carried out. We call this entity Newtonian 
space. 

We need not dwell here on the historical and personal antecedents of 
Newton’s revolutionary approach to place and motion.° But there are two 
reasons for it that must be mentioned. In the first place, in order to elicit the 
true forces of nature from the study of motion one should presumably be able 
to distinguish true from apparent motions. To achieve this, one cannot just 
refer place and motion to contiguous bodies “that we regard as being at rest”, 
as required by Descartes’ definition, because any surroundings, no matter how 
stable they might seem, can actually be in the course of being removed from 
other, vaster surroundings—a possibility vividly brought out by Copernican 
astronomy. Let us call this first reason for Newton’s novel approach “the 
kinematical requirement”. The second reason that we must mention is that 
Newton's search for the mathematical principles of natural philosophy could 
only make sense, given the scope of 17th century mathematics, if the full set of 
points required for every conceivable Euclidean construction was allowed 
some manner of physical existence.° We may call this “the requirement of 
physical geometry”. Since Newton would not grant the infinite extension and 
divisibility of bodies, he could not admit that the said set of points was actually 
present in matter and could be recovered from it by abstraction.” He therefore 
located them in Newtonian space,® which he regarded as having “its own 
manner of existence which fits neither substances nor accidents” and 
consequently falls outside the acknowledged categories of traditional on- 
tology.” Having admitted as much in order to satisfy the requirement of 
physical geometry, it was only natural that Newton should have considered his 
“absolute”—1.e. unbound or free—space as the perfect answer to the kinema- 
tical requirement. For true or “absolute” places can now be defined as parts of 
absolute space; and the real rest and the real motion of a body as the 
continuance or variation of the absolute place it occupies. 

And yet Newton was wrong on this point. Indeed Newtonian space has been 
a perpetual source of embarrassment to scientists and philosophers not 
because it is inherently absurd—after all, it is fully characterized by an 
admittedly consistent mathematical theory—or because it cannot be defined 
“operationally”—few serious thinkers would really mind that; but because it is 
kinematically inoperative. To understand this let us observe first that 
Newtonian space does not provide any criterion by which even an omniscient 
demon could determine the position of a body in it. Newtonian space possesses 
no more structure than is required for it to be a model of Euclidean geometry 
(completed as indicated in note 6). Hence there are no distinguished points or 
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directions in it. Therefore, a body placed anywhere in Newtonian space cannot 
be described otherwise than the same body placed elsewhere. Moreover, since 
the structure of space is preserved by translations, rotations and reflections, it 
does not of itself permit the identification of points through time, unless four 
of them, at least, are singled out, say, as the vertices of a rigid tetrahedron. Since 
motion in Newtonian space can only make a difference in relation to such a 
four-point system, it is obviously meaningless to speak of the motion or rest of 
the system itself, or of any system of rigidly connected rigid bodies comprising 
it, if they are considered absolutely, in complete isolation from any other 
body.'° In the Principia Newton proposes two Gedankenexperimente for 
testing the absolute motion of an isolated system. In both of them he resorts to 
deformable bodies: a mass of water, a cord. Their deformations, interpreted in 
the light of the Laws of Motion, yield the answer to the question regarding the 
state of motion or immobility of the system. Introducing this passage Newton 
says that “the causes by which true and relative motions are distinguished one 
from another are the forces impressed upon bodies to generate motion”.!! A 
short reflection will persuade us that, if the sole grounds for ascribing 
kinematic predicates are dynamic, one cannot discern absolute rest from 
absolute motion in Newtonian physics. For in it dynamical information can 
refer only to the magnitude of the innate force or to the magnitude, direction 
and point of application of the impressed force. A given innate force acts 
equally in a body at rest or in a body moving with any velocity whatsoever— 
namely, as a resistance to acceleration; while a given impressed force effects on 
a rigid body a change of motion depending on the time during which the 
impressed force acts and on the magnitude of the innate force that resists it, but 
not on the body’s initial state of motion or rest. Hence dynamical data may 
enable one to ascertain the rate—zero or otherwise—at which a body is 
accelerating, but not the rate—zero or otherwise—at which it is moving 
uniformly in a straight line. It follows that Newtonian space cannot fulfil the 
kinematical requirement; at any rate, not in the context of Newtonian theory. 
On the other hand, its Euclidean structure constitutes the system of distances 
and angles on which the quantitative study of motion is based in this theory. 
However, as we shall see in Section 1.6, there may be found another way of 
linking that structure with physical reality which does not involve the 
postulation of absolute Newtonian space. 


1.3. Absolute Time 


Aristotle characterized time as the measure of motion.! He remarked, 
however, that “we do not only measure the motion by the time but also the 
time by the motion”.? Among all the motions that are constantly taking place 
in the Aristotelian cosmos there is one which is most appropriate for time 
reckoning, and that is the uniform circular motion (homales kuklophoria) of the 
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firmament, “because its number is the best known”. This explains why time had 
been identified (by the Pythagoreans) with the motion of the heavenly sphere, 
“for the other motions are measured by it”.*> However, Aristotle rejected this 
identification, because “if there were many heavens, the motion of any of them 
would likewise be time, so that there would be many times simultaneously 
(polloi khronoi hama)”. But this is absurd, for time is “the same everywhere at 
once (ho autos pantakhou hama)”.* 

Aristotle’s conception of a single universal time, perpetually scanned by the 
everlasting motion of the sky, but not to be confused with it, clearly presages 
Newton's “absolute, true, and mathematical time”, every moment of which is 
everywhere.* The latter, it is true, is said to ‘flow equably in itself and by its 
own nature, without relation to anything external”.® But such detachment was 
to be expected after the new astronomy has shattered to pieces the eternal all- 
encompassing aethereal dome whose daily rotation measured out Aristotelian 
time. In a Newtonian world “it is possible that there is no equable motion, 
whereby time may be accurately measured”.’ But this cannot detract from the 
reality of time itself, by reference to which the swiftness or slowness of the true 
motions of bodies can alone be estimated. 

Newtonian time, like Newtonian space, is accessible only with the assistance 
of Newton’s dynamical principles. There is one significant difference between 
space and time however. In Section 1.2 we saw that the dynamical principles 
enable us to ascertain the absolute acceleration of a body but do not provide 
any cue as to its absolute velocity. For this reason, we raised serious doubts 
regarding the physical meaning of the concept of Newtonian space. 
Newtonian time, on the other hand, is not subject to such doubts, for the First 
Law of Motion provides the paradigm of a physical process that keeps 
Newtonian time, and this is enough to ensure that the latter concept is 
physically meaningful, even if no such process can ever be exactly carried out in 
the world. According to the First Law, a body moving along a graduated ruler 
will—if both it and the ruler are free from the action of impressed forces— 
traverse equal distances in equal times. The flow of Newtonian time can 
therefore be read directly from the distance marks the body passes by as it 
moves along the ruler. C. Neumann (1870), in the first serious attempt at a 
rational reconstruction of Newtonian kinematics, used just this consequence 
of the First Law for defining a time scale.® 

The 20th-century reader will be quick to observe that our ideal Newtonian 
clock keeps time only at the place where the moving body happens to be, and 
that one is not even free to change that place at will. But this restriction does 
not seem to have worried anyone before the publication of Einstein’s 
“Electrodynamics of Moving Bodies” (1905). Apparently Neumann and other 
late 19th-century critics of mechanics tacitly assumed with Aristotle that “time 
is equally present everywhere and with everything”,? and understood, with 
Newton, that “every moment of time is diffused indivisibly throughout all 
spaces”.'° A more portable kind of Newtonian clock can be conceived in the 
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light of the Principle of the Conservation of Angular Momentum, a familiar 
consequence of the Laws of Motion.'! By this Principle a freely rotating rigid 
sphere of constant mass will indefinitely conserve the same angular velocity. 
The magnitude of the latter can be measured with a dynamometer attached to 
the surface of the body at some distance from its axis.'? We can imagine a set of 
such contraptions—which we may call spinning clocks—rotating with the 
same angular velocity. Let some of them be placed at regular intervals along 
the graduated ruler of the Newtonian clock we described first, and take the 
moving body to be another one. Let each of the clocks on the ruler be set by the 
moving clock as it passes by them. (To do this one does not have to tamper with 
the motion of the clocks; it is enough to adjust their readings.) In this way one 
could, it seems, “diffuse” the same time over all the universe. 

However, this definition of time by standard Newtonian clocks could well be 
ambiguous. Take two spinning clocks, A and B, rotating with the same angular 
velocity w (as measured by dynamometers on each), and send them along our 
graduated ruler with constant velocities v4 and vg (as measured with the clocks 
on the ruler). Let vu, 4 vg, and assume that A is initially synchronized with B, 
and also agrees with each clock on the ruler as it passes by it. Do the Laws of 
Motion constrain B to agree with every spinning clock that it meets along the 
ruler? Suppose that it does not agree with them, but that, say, it appears to run 
slower than them. In that case, of course, the angular velocity of B, measured 
with the clocks on the ruler, must be less than w. However, this does not clash 
with our assumption that it is exactly w if measured with a dynamometer affixed 
to B. To claim that such a discrepancy is absurd would beg the question. The 
local processes that measure out the time parameter of Newton’s Laws of 
Motion will unambiguously define a universal time only if they agree with each 
other transitively and stably—.e. only if the synchronism of any such process 
A witha process Bat one time and witha process C at another time ensures the 
synchronism of B with C at any time. This strong requirement is presupposed 
by the Laws of Motion as Newton and his followers understood them. 

It took the genius of Einstein to discover the problem raised in the preceding 
paragraph. Before him, the intrinsically local nature of the time kept by ideal 
and real clocks ina Newtonian world was generally ignored. After it is pointed 
out, it will seem fairly obvious to anyone who cares to reflect about it. The 
following remarks may help explain why nobody perceived it sooner. While 
the uppermost heavenly sphere was regarded as the limit of the world, the time 
reckoned by its daily rotation had an indisputable claim to universality. By 
symmetry, the rotating earth reckons the same time. In fact, the earth loses 
angular momentum through tidal friction, but a Newtonian will readily 
substitute for it a perfectly spheric and rigid spinning clock. The Copernican 
earth and its idealized Newtonian surrogate are thus legitimate heirs to the 
time-keeping role of the Aristotelian firmament, and it is not surprising that 
the men who first acknowledged them as such should have overlooked the fact 
that they can only keep time locally, not universally. 
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1.4. Rigid Frames and Coordinates. 


As we saw on page 11, stable and changing positions in a Euclidean space 
can only be identified if at least four non-coplanar points are singled out init. A 
set of four or more distinguished non-coplanar points in a Euclidean space will 
here be called a rigid frame. A rigid frame F is thus uniquely associated with 
what we may call, after Newton, a “relative space”,’ the relative space of F, 
hereafter denoted by S;. In physical applications, where the space must be 
viewed as lasting through time and as providing a place for things, a rigid frame 
can be visualized as a rigid body—or a collection of rigidly connected such 
bodies—on which four or more non-coplanar points are marked. If we assume 
that material particles do not move discontinuously, it is evident that any 
particle whose distances from three non-collinear points ofa rigid frame F are 
constant will also preserve its distance from each point of F and may therefore 
be said to be at rest in F. Asystem of particles is at rest in F if every particle of it 
is at rest in F'; otherwise it is said to move in F. If F and F’ are two rigid frames 
such that F is at restin F', Fis also at rest in F, and any particle at rest in either 
will be at rest in both. Hence, any two rigid frames such that one is at rest in the 
other can be regarded as representative parts of one and the same frame. 
Consequently, if one is given a rigid frame F one may always pick out any four 
non-coplanar points O, X,, X,, and X;3, of its relative space S;, and let them 
stand for F. Every point of S; is then uniquely associated with four non- 
negative real numbers, namely, its distances to the four points O, X,, X,, and 
X ,. However, it is more usual and more useful to label the points of the space 
Sp with (negative or non-negative) real number triples. This can be done in 
many different ways. The most familiar and in a sense the most natural way of 
doing it is by means of Cartesian coordinates. Choose O, X,, X,,and X, in S; 
so that OX,, OX, and OX, are mutually perpendicular. A point P in S¢ is 
assigned the real numbers x'(P), x7(P) and x°(P) such that |x'(P)| is the 
distance from P to the plane OX ; X, and x’(P) is positive ifand only if Pand X; 
lie to the same side of that plane (i # jf # k # i31,j,k = 1,2,3). The assignment 
P53 (x'(P), x?(P), x3(P)) is an injective (one-to-one) mapping of S, onto R?, 
which we call a Cartesian chart of S- (or a Cartesian coordinate system for the 
frame F ). We denote this chart by x; (x(P), the value of x at P,is then, of course, 
the number triple (x1(P), x?(P), x?(P)).) We say that x is based on the 
tetrahedron OX,X,X,. The mapping P++x'(P) of S, onto R is the ith 
coordinate function of the chart. Note that a Cartesian chart correlates 
neighbouring points of S; with neighbouring points of R°, and vice versa: it isa 
homeomorphism of Sz onto R?*. It is also an isometry (an isomorphism of metric 
spaces) insofar as any two congruent point-pairs in S; are mapped by it on two 
equidistant pairs of number triples in R*.2 There are many methods of 
labelling the points of a Euclidean space by real number triples, which are 
homeomorphisms of the space (or a region of it) into R*—we call them 
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generically charts (or coordinate systems)—but only Cartesian charts have the 
additional advantage of being isometries. (Later on, we shall consider spaces 
that do not have any charts with this property.) Let x and y be two charts of Sr, 
based respectively on O,X,X,X, and O,Y,Y,Y3. For simplicity’s sake we 
assume that the distances that determine the values of x and y are measured in 
the same units. The coordinate transformation y-x~ ! between chart x and chart 
y is evidently an isometry of R? onto itself. The values of x and y at an arbitrary 
point P of S; must therefore be related by a linear equation 


y(P) = Ax(P)—k (1.4.1) 


where A is an orthogonal matrix and k is a real number triple that can be 
readily seen to be equal to Ax(O,) (since y(O,) = (0, 0, 0)). The determinant 
|A| = +1. If |A|> 0, x and y are said to be similarly oriented. We note in 
passing that the coordinate transformation y: x! and its inverse x-y~! are 
continuous mappings that possess continuous partial derivatives of all orders 
(mappings of class C “). The relative space S; together with the collection of all 
its Cartesian charts, is therefore a 3-manifold, in the sense defined in the 
Appendix (pp. 257f.) 

In physical applications, in which one must be able to identify what is 
happening at different places in S, at different times, one will also associate a 
time coordinate function with the rigid frame F. The meaning of this move will 
not become clear until Section 1.6. Let us now say only that this function 
assigns real numbers to events in Sf, in such a way that simultaneous events 
are given the same number, and successive events are given successive numbers. 
We may visualize the assignment as carried out by means of Newtonian clocks 
at rest in F wherever the events concerned occur, which have been syn- 
chronized in accordance with Newton’s Laws of Motion. As noted on page 13, 
Newtonian physics presupposes that such a synchronization can be performed 
in an essentially unique way, which does not depend on the frame F. If two 
time coordinate functions ¢t and t’ have been constructed in the aforesaid 
manner, the real numbers ((E) and t’(E) that they assign to a given event E will 
be related by the linear equation ¢'(E) = at(E)+b, with a # 0. If b #0, the 
two functions have different origins (that is, they assign the number 0 to 
different sets of simultaneous events); if a+ 1, they employ different units; if 
a<0, the coordinate transformation t'-t~’ reverses time order. In the rest of 
this chapter, we always assume that all time coordinate functions associated 
with rigid frames employ the same time unit and define the same time order, 
differing at most by their choice of an origin. 


1.5. Inertial Frames and Newtonian Relativity. 


Consider now a rigid body B moving in a rigid frame F. Since B contains at 
least four non-coplanar points it isa rigid frame too. Its relative space Sg differs 
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from S;. Let P be a particle that moves uniformly in a straight line in Sp. 
Depending on how B moves in Sr, P will be at rest in B or will move in Sp, in 
different ways, uniformly or non-uniformly, describing trajectories of different 
shape and length. It follows at once that if the Laws of Motion give a true 
description of the motions of a body in one or more rigid frames they must give 
a false description of the motions of the same body in other rigid frames. One 
might remark that this was to be expected, for the Laws of Motion are 
supposed to govern absolute motions, but a motion ina rigid frame is a relative 
motion, a change of place in the relative space of that frame. We saw, however, 
that, even for someone willing to grant absolute space the peculiar manner of 
existence that Newton claims for it, the concept of absolute motion is 
completely devoid of meaning. Hence, our next task must be to characterize 
the rigid frames in which motions are adequately described by Newton’s Laws. 
We shall say that the Laws are referred to such frames. If there were only one 
frame to which the Laws of Motion are referred we could reasonably identify 
the relative space determined by it with Newton’s absolute space. But what we 
said on page 11 about the kinematical information that can be derived from 
dynamical data according to those Laws indicates that there must exist an 
infinite family of frames moving in one another’s relative spaces, to which the 
Newtonian Laws of Motion are referred. 

In his inaugural lecture on the Principles of Galileo-Newtonian (G-N) 
Theory, Cari G. Neumann pointed out that the First Law of Motion was 
unintelligible as it was stated by Newton, “because a motion which, e.g. is 
rectilinear when viewed from our earth, will appear to be curvilinear when 
viewed from the sun”. Hence, in order to give a definite sense to that Law, one 
must postulate, as the First Principle of G-N Theory, that “in some unknown 
place of the cosmic space there is an unknown body, which is an absolutely rigid 
body, a body whose shape and size are forever unchangeable”.? Neumann calls 
it “the Alpha body” (der Kdrper Alpha). He observes that its existence can be 
asserted with the same right and the same certainty with which his 
contemporaries asserted the existence of a luminiferous aether, that is, as the 
fundamental hypothesis of a successful physical theory. For those, however, 
who might feel that the Alpha body is too far-fetched, he adds that it need not 
be a material body, but may be constituted, say, by three concurrent 
geometrical lines. such as the principal axes of inertia of a non-rigid body.? The 
First Law of Motion can now be meaningfully stated in two parts: (Second 
Principle of G-N Theory) A freely moving material particle describes a path 
that is rectilinear with reference to the Alpha body; (Third Principle of G-N 
Theory) Two such material particles move so that each traverses equal 
distances while the other traverses equal distances.* The last principle 
evidently presupposes the universality of time, which Neumann, as we 
observed earlier, simply took for granted. The “Third Principle” yields 
Neumann's definition of a Newtonian time scale, to which we have already 
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referred: “Equal time intervals can now be defined as those in which a freely 
moving particle traverses equal distances [in the Alpha body’s frame].”> We 
see at once both the analogy and the difference between this definition ofa time 
scale and the postulation of the Alpha body. The former does for absolute time 
what the latter was meant to do for absolute space, that is, it provides a 
physically meaningful surrogate, or, as I would rather say, a realization of it. 
But while the uniqueness of the time scale—tacitly assumed by Neumann and 
his contemporaries—can be vindicated if the Laws of Motion are true (see 
Section 3.7), the Alpha body is necessarily non-unique unless the Laws of 
Motion are false. If « is a body that satisfies Neumann’s requirements and f is 
another body which moves uniformly in a straight line without rotation in @’s 
frame, f also satisfies Neumann’s requirements and is just as good an Alpha 
body as «. A freely moving particle will describe a straight line with constant 
velocity, and, if it is acted on by an impressed force, it will suffer the same 
change of motion, in the frame of either « or 8. Newton himself was well aware 
of this important feature of his theory.® In the 1880s James Thomson and 
Ludwig Lange based on it their independently arrived at but essentially 
equivalent rigorous restatements of the Law of Inertia. 

We shall follow Lange (18855). Like Neumann, Lange analyses the First 
Law into two parts concerning, respectively, the shape of the trajectory 
described by a freely moving particle, and the speed with which such a particle 
moves along its trajectory. In order to bestow an intelligible meaning on either 
part of the Law, Lange defines a type of rigid frame in whose relative space the 
said trajectory is rectilinear, and a type of time scale in terms of which the said 
speed is constant. Each type is appropriately labelled inertial. The full 
statement of the First Law consists of two definitions and two “theorems”. The 
latter express, in the terminology fixed by the former, the physical content of 
the First Law. The following is a paraphrase of Lange’s text. 


Definition 1. A rigid frame F ts said to be inertial if three freely moving 
particles projected non-collinearly from a given point in F describe 
straight lines in S¢. 

Theorem 1. Every freely moving particle describes a straight line in an 
inertial frame. 

Definition 2. A time scale t is said to be inertial if a single particle moving 
freely in an inertial frame F travels equal distances in S; in equal times as 
measured by ¢. 

Theorem 2. Every freely moving particle travels equal distances (in the 
relative space of an inertial frame) in equal times (measured by an inertial 
time scale).’ 


Consider two inertial frames F and F’. F must move uniformly ina straight 
line without rotation in the relative space S;. (Otherwise, a freely moving 
particle that satisfies theorems 1 and 2 in F’ would not satisfy them in F.) We 
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assume that the same time coordinate function has been associated with F and 
F'. Each point P of S; coincides at any given time ¢t with a point of S, that we 
shall denote by f(P, t). We make the Euclidean space S; into a (real, three- 
dimensional, normed, complete) vector space by choosing in it an arbitrary 
point 0 as zero vector.® We denote f(0, 0) by 0’, and choose it as the zero of S ¢. 
The mapping r++f(r, 0), re S,, is then clearly an isomorphism of the normed 
vector space (S;-, 0) onto (S_-, 0’). We denote this mapping by fo, and write fg (r) 
for f(r, 0). On the other hand, the mapping tr+/f(0, 1), t¢R, is acurve in Sp, 
whose range is the straight line described by 0 as F moves in F’. The velocity of 
0 along this line is the derivative @f(0, t)/6t, whose constant value may be 
regarded as a vector in S;..? We denote this vector by v. Evidently the constant 
velocity f(r, t)/et with which any point r of S$; moves in S;- is also equal to v. 
Hence 


f(r, t) = fo(r) + tv (1.5.1) 


Consider now a particle « of constant mass m, whose position in Sat time t 
is r(t). Let r’(t) stand for f(r(t), t), its simultaneous position in S,. Set 
F(t) = dr(t)/dt and r’(t) = dr'(t)/dt. Recalling that dfo/dr is everywhere 
equal to fo (because fo is a linear ee we see that: 


d 
N= Sfleo,9 =< (fo(r(t)) + tv) 


Wo} de 
dr |i dt], 
= fo(F(t)) + v. (1.5.2) 


In particular, if « moves with constant velocity w in F, it moves with constant 
velocity fo(w) + vin F’, where fo(w) is the copy of win S;- by the isometric linear 
mapping fo. If « is acted on by a force K at time t, and this force is represented 
in (S;, 0) and (S;, 0’) by the vectors k and k’, respectively, we have that 
(writing ¥ for dF/dt): 
d(mf) 
dt 


= mi(t). (1.5.3) 


t 


Since m is the same in F and F’, and v is constant, 


kK’ = mr'() = a (folF() +v) 
dfo dF (t) 
a dr r= W(t) dt 


= mfo (F(t) = fo(mr (t)) 
= fo(k). (1.5.4) 
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The vectorial representations of a given force of nature in different inertial 
frames, according to the Second Law of Motion, are therefore isometric 
transforms of one another.!° According to the Third Law, when « is acted on 
by K, it must instantaneously react on the source of K with a force of equal 
magnitude and opposite direction. Hence this force will be represented in 
(S;,0) by —k and in (Sp,0') by fo(—k) = —/fo(k). It follows that in 
Newtonian mechanics all inertial frames are equivalent, and that the 
Newtonian Laws are referred to such frames. The equivalence of inertial 
frames for the description of motions and the forces that cause them can 
appropriately be called the Principle of (Newtonian) Relativity. It was stated by 
Newton himself in Corollary V to the Laws of Motion: 


The motions of bodies included in a given space are the same among themselves, whether 
that space is at rest, or moves uniformly forwards in a right line without circular motion."! 


Lange’s definition of an inertial frame depends on the concept of a freely 
moving particle. It might seem that this concept is ambiguous.'? For let F and 
G be two rigid frames with the same associated time coordinate function, such 
that G moves in F in a straight line without rotation and with constant 
acceleration a. Let fo:S;— Sg assign to each point of S; the point of Sg with 
which it coincides at time 0. Let ¥ and G be two families of frames, indexed by 
Sp and Sg, respectively, such that each element F, of # moves in F with 
constant velocity v, and each element G, of Y moves in G with constant 
velocity w. A particle « at rest in Fye¥ moves in FyeF with a constant 
velocity represented in S$; by v—u,!? but accelerates in G with constant 
acceleration —f, (a); while a particle at rest in G,e Y moves in G,e 9 witha 
constant velocity represented in Sg by w —z, but accelerates in F with constant 
acceleration a. It seems that we cannot tell which of F or & consists of inertial 
frames, even if we happen to know for certain that one of them does. Indeed, 
we can look upon ¥ and Y as members of a family of families, #, indexed by 
S,, such that each member Y, of # consists of rigid frames moving uniformly 
in one another and accelerating with constant acceleration b in F. (In this 
indexing, our F is of course Yo; our Y, Y,). At first sight, it would seem that 
Newton’s theory gives us full freedom to choose any member of # to be the 
family of inertial frames. This tremendous increase in the scope of Newtonian 
relativity appears to be confirmed by Corollary VI to the Laws of Motion: 


If bodies, moved in any manner among themselves, are urged in the direction of parailel lines 
by equal accelerative forces, they will all continue to move among themselves, after the same 
manner as if they had not been urged by those forces.'* 


Such an appearance is deceptive, however. For the Third Law of Motion 
furnishes a Newtonian physicist with all the needs for distinguishing, in 
principle, between a particle acted on by a true force of nature and a free 
particle accelerating in a particular—necessarily non-inertial—frame. If a 
material particle « of mass m experiences acceleration a in an inertial frame F, it 
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will instantaneously react with force —ma on the material source of its 
acceleration.'*> There must exist therefore a material system 8, of mass m/k, 
whose centre of mass experiences in F the acceleration —ka. On the other 
hand, if a particle « accelerates in a non-inertial frame, its acceleration must 
include a component that is not matched by the acceleration of another 
material system, in direction opposite to the said component, caused by the 
action of a on that system.'® 

Though Newton’s Laws are referred to inertial frames, they can certainly be 
used for explaining and forecasting the motions of material systems in rigid 
frames of every kind. The unmatched accelerations exhibited by bodies 
moving in non-inertial frames are usually ascribed to fictitious forces, 
computed according to the Second Law. We call such forces, inertial forces. 
They include the inertial force—in the ordinary sense of the word—that shows 
up in frames that move non-uniformly in a straight line (such as a bus or an 
underground train) and the centrifugal, Coriolis and d’Alembert forces, that 
show up in rotating frames.” All inertial forces have the following properties: 
(i)they cannot be traced to material sources on which the particle or body 
acted on by them is currently reacting; (ii) they act on each material object in 
proportion to its mass, independently of its state or composition; and (ili) they 
cannot be shielded off by insulators. All three properties are naturally to be 
expected in fictitious forces, that are postulated as the putative cause of 
accelerations arising from the state of motion of the frame in which they occur. 
But only the first property singles out a force as fictitious. Properties (ii) and 
(iii) are shared also by the force of gravity, which Newton, while confessing 
that he was unable to understand it, held to be very real indeed. 


1.6 Newtonian Spacetime 


As we shall see in Chapter 3, the concept of an inertial frame played an essen- 
tial role in Einstein's original statement of Special Relativity in 1905. Einstein 
showed that if, as experience suggests, the propagation of electromagnetic 
signals in vacuo in the relative space of an inertial frame does not depend on the 
state of motion of their source, the physical equivalence of all inertial frames 
can only be maintained by ascribing to each such frame a different time, 
entailing a different partition of physical events into classes of simultaneous 
events. In 1907, Hermann Minkowski showed that Special Relativity can be 
restated in a more elegant and perspicuous, and, as proved by the eventual 
development of Einstein's theory of gravitation, remarkably fruitful manner, 
by postulating a single, metrically structured, four-dimensional manifold, 
which Minkowski significantly called “the world”, but which is now generally 
knownas Minkowski spacetime. The relative space and the relative time of each 
inertial frame can then be readily obtained, respectively, by orthogonal 
projection of the manifold onto a hyperplane of a particular kind and onto an 
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axis perpendicular to that hyperplane. Subsequently, Elie Cartan 
(1923/24/25) and other writers’ showed how to construct what may 
appropriately be called Newtonian spacetimes, that is, four-dimensional 
manifolds suitably structured for the reformulation of Newtonian dynamics in 
a style analogous to that of Minkowski’s statement of Special Relativity. 
Though the main purpose of these constructions was to facilitate the 
comparison of Newton’s gravitational theory with Einstein’s, they can be 
easily adapted to a situation in which matter is so scarce and is so sparsely 
distributed that gravitational effects can be ignored. As Earman and Friedman 
(1973) have ably shown, such an empty Newtonian spacetime provides a much 
more natural setting than Lange’s inertial frames for describing and under- 
standing the behaviour of Newtonian test particles moving freely or under the 
action of impressed forces. 

Philosophical attitudes toward the concept of spacetime are often con- 
ditioned by the idea that it is unintuitive and hence—it is believed—unnatural. 
Such a “formal” mathematical artifact may be —it is thought—of great help in 
calculations, but it cannot be regarded as a proper, even if admittedly 
simplified and “idealized”, representation of a physical reality, in the manner in 
which Newtonian space, or, at any rate, the relative spaces of Lange’s frames, 
are readily allowed to be. However, unless you unwittingly equate the 
“intuitive” with the intellectually familiar, you will surely acknowledge that the 
infinitely extended Euclidean space, with its infinitely articulated net of 
depthless planes and widthless lines, meeting at dimensionless points, is not a 
whit more intuitive than, say, Minkowski spacetime.” Even if you believe that 
the idea of a region in space can be obtained by abstraction and idealization 
from our direct grasp, say, of the interior of a room, you must admit that this 
idea is one step further removed from reality as experienced than the idea of a 
spacetime region. For we perceive the interior of a room for a lapse of time, no 
matter how short, during which even the quietest room will exhibit sound 
patterns-—an insect’s flutter near the upper right-hand corner in front, the hum 
of traffic beneath the window to the left—-and other manifestations of 
becoming. If, by an operation of abstractive idealization, we ignore all 
qualitative differences in the room and form the conception of an homoge- 
neous continuum underlying them, this continuum must be four-dimensional, 
since it must provide locations where the perceived patterns may not only 
stand beside each other, but also succeed one another. The conception of the 
three-dimensional continuum of space can only be arrived at by this method of 
abstraction through a second operation, by which one ignores even the 
unqualified possibility of change. 

Though some kind of four-dimensional continuity underlies all our 
representations of physical objects, an exact concept of a spacetime manifold 
was not achieved before Minkowski. In Minkowski spacetime we must 
distinguish (i)the underlying four-dimensional differentiable manifold, dif- 
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feomorphic with R*, and (ii) the peculiar affine and metric structure bestowed 
on that manifold in consonance with Special Relativity. Here we need only 
busy ourselves with (i), which is shared by Minkowski spacetime and the 
Newtonian spacetime that is the subject of the present section. Though the 
notion of an n-manifold (see page 258) was first conceived by Bernhard 
Riemann in the 1850s, one can now, with the benefit of hindsight, reasonably 
maintain that spacetime was already handled as a 4-manifold when Galilei 
plotted spaces against times in his discussion of the motion of projectiles.? The 
development of coordinate geometry by Descartes and his successors fur- 
thered the intellectual habits and furnished the conceptual tools that make for 
this interpretation of Galilei’s procedure. We find time explicitly referred to as 
a fourth dimension of the world in some 18th century texts.* However this idea 
received little or no attention during the 19th century, while n-dimensional 
geometry became increasingly fashionable among mathematicians.> The 
neglect of the mathematical concept of spacetime was due perhaps to the fact 
that until late in the 19th century mathematicians did not deal with the 
topological and differentiable structures of R"” as with something distinct from 
and more fundamental than its standard metric structure. For unless one 
“forgets” the Euclidean metric the analogy of R* is more harmful than helpful 
to the idea of a spacetime manifold.° 

Perceived continua can be conceived as n-manifolds only if they are suitably 
idealized.’ But the concept of spacetime as a 4-manifold does not depend on 
more extreme idealizations than the standard mathematical representations of 
physical space and time. If physical locations and durations are idealized, 
respectively, as points and instants, one can without more ado conceive an 
instantaneous occurrence at a point, such as the durationless collision of two 
dimensionless particles. Such ideal occurrences are generally called point- 
events, or simply events. If we do not admit genuine vacua in the world and we 
regard the mere continuance of a physical state as being in some sense an 
occurrence, we shall take the collection of all events as the underlying set of the 
spacetime manifold. However, if in theoretical discussions we wish to 
countenance an empty or almost empty world, we must conceive the 
underlying set of our spacetime manifold not as the collection of all physical 
events but as the collection of all punctual instantaneous locations available 
for them. This is the set of “possible events” often mentioned by physicists. It 
should be clear at once that a “possible event” in this peculiar sense is not quite 
the same sort of thing that is ordinarily designated by this expression. Whoever 
allows the possible to extend beyond the actual will also grant that several 
mutually incompatible events might alternatively occur at a given time and 
place. All these possible events furnish but a single argument to a space-time 
coordinate system, and must consequently be fused into a single “possible 
event” in the physicists’ special sense. The “possible events” that make up the 
physicists’ spacetime manifold are therefore in effect equivalence classes of 
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(idealized, i.e. instantaneous and punctual) possible events in the ordinary 
sense. The equivalence that defines them is none other than the identity of 
spatio-temporal location. 

Though the notion of a “possible event” is even more remote from perceived 
reality than that of a real but instantaneous occurrence at a point, it is already 
implicit in the kinematical concepts developed in the foregoing sections. Let F 
and F’ be two inertial frames. At each instant t, each point r of the relative space 
S, coincides with a point r’ of the relative space Sp. The instantaneous 
coincidence of rand r’ specifies a “possible event” in the above sense. It is the 
locus of a real one if, for instance, there are particles « and a’, at rest, 
respectively, in r and r’, during a time interval ending at t. Let X be a Cartesian 
coordinate system for F, with coordinate functions x, x? and x?. Let x° 
denote the Newtonian time coordinate function, ranging over all R, associated 
with F and F’. Hereafter, unless otherwise noted, small Latin indices i, j,k, 
... will range over 0, 1, 2,3, while small Greek indices, «, 8B, y,... range over 
1, 2,3. The four functions x! coordinate a different number quadruple with 
each coincidence ofa point of S; witha point of S;-, thereby defining a bijective 
mapping of the set of all “possible events” onto R*. Let x denote this mapping. 
Let x! = 2,-x (where 7; is the i-th projection function of R*, which sends each 
real number quadruple to its (i+ 1)-th element). We see at once that the 
“possible events” mapped by x* on a given real number are precisely the 
coincidences involving the points of S; mapped by x* on the same real 
number. We call the three mappings x’ the space coordinate functions of x. On 
the other hand, x° and X° are in fact the same function, for they are both 
defined on “possible events” and they map on a given real number precisely the 
same set of simultaneous events.® We call x° the time coordinate function of x. It 
is clear that our intuitions of spatiotemporal vicinity will be properly taken 
care of if the set of “possible events” is endowed with the weakest topology that 
makes x into a continuous mapping. Let / denote the set of “possible events”, 
with this topology. x is then a global chart of M and {x} is an atlas of ./, in the 
sense defined on page 257. With the differentiable structure determined by this 
atlas, is a 4-manifold, diffeomorphic with R* (see p. 258). . is indeed the 
manifold underlying both Newtonian and Minkowski spacetime to which I 
referred under (i) on page 21. We shall henceforth designate the points of 
—or of any other spacetime manifold—by the name of worldpoints, used by 
Minkowski, which is less clumsy and inaccurate than “possible events”. A 
regular smooth or piecewise smooth curve in .@ or the path defined by it (pp. 
259, 261) is called a worldline. 

A chart of 4 constructed like x from a Newtonian time coordinate function 
and a Cartesian coordinate system for a given inertial frame, will be called a 
Galilei chart, adapted to that frame. Through the study of Galilei charts and 
their coordinate transformations we shall arrive at a characterization of the 
specific structure of Newtonian spacetime. We prove, first of all, that if x and y 
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are two Galilei charts adapted to the inertial frames, F and G, respectively, the 
coordinate transformations x-y~1 and y-x~! are linear permutations of R* 
(i.e. linear bijections of R* onto itself). Since a change of scale amounts to a 
linear permutation, we can assume, without loss of generality, that the same 
units of time and length are used in constructing x and y. Let O be the 
worldpoint such that x(O) = (0,0,0,0), and let y(O) = (k°,k!,k?, k?). There 
exists a Galilei chart z, adapted to G, such that 


zi = yi—ki (1.6.1) 


It is clear that z(O) = x(O) and that z° = x° on all of 4. Consider the 
worldpoints P,, such that x(P,) =(1,0,0,0),  x(P,) = (0,1,0,0), 
x(P,) = (0,0,1,0) and x(P3) = (0,0,0,1), and let 2(P,) = (1,0,,02,03) and 
2(P,) = (0,444; 42a, 43q). The meaning of the twelve numbers v,, a,g, is not hard 
to see. Let X and Z denote, as before, the Cartesian systems for F and G from 
which x and z are built. O can be regarded as the coincidence of a point o in S; 
with a point o’ in S¢. Po is the coincidence of the same point o witha point 0” in 
Sg. Hence F moves in G with constant velocity in the direction from o’ to o”. If 
Z,(t) is the value of Zat the point of Sg with which the point p of S; coincides at 
time t, it is clear that dz,/dz° = (v,,v2,v3). The v, are therefore the velocity 
components of F in the Cartesian system z. The mapping x(p)++Z,(0) is a 
rotation or a rotation-cum-parity of R*, with matrix (a,,).” The matrix (a,,) is 
therefore orthogonal and has no more than three independent components. '° 
It is now easy to calculate the values of the coordinate functions z' at an 
arbitrary worldpoint P if the values of the x! at P are given. Let P be the 
coincidence of pe S; with p’e Sg. At the time of O, while o coincides with o’, p 
coincides with p” € Sg. Noting that the time interval between O and P is given 
by 2°(P) —z°(O) = x°(P), it will be seen that z*(p”) = Z*(p’) —v,x°(P). Since 
2%(p") = Zga,gX*(p), it follows that Z*(p') = Lga,,xX*(p) + v,x°(P). Con- 
sequently, the coordinate transformation z: x~ ' is governed by the equations: 

29 = x? 

2* = Leaagx? +0,x° fo 
Substituting from equations (1.6.1) into (1.6.2) we find the general form of the 
equations governing the coordinate transformation yx ', where x and y are 
any two Galilei charts constructed using the same units of time and length: 


et: (1.6.3) 
y* = Lea gx? +u,x° +k, i 


y:x7! is therefore a linear permutation of R*. The same is true of its inverse 
xp, 

To avoid useless complications we shall henceforth consider a represen- 
tative collection of Galilei charts, which we call the Galilei atlas Ay. Agconsists 
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of an arbitrarily chosen Galilei chart x and of all Galilei charts y related to x by 
equations of the form (1.6.3) in which the orthogonal matrix (a,,) has a positive 
determinant. This implies that (i) all charts of Ag involve the same units of 
length and time; (ii) their time coordinate functions increase in the same 
temporal direction, and (iii) at any given time, the space coordinate functions 
of two charts in A g, adapted, say, to the frames F and G, can be made to agree 
on coincident points by subjecting either frame to a translation followed by a 
rotation in the other frame (no spatial reflection is involved). Let x be any chart 
in A,, adapted to an inertial frame F. x(.# ) is then R*, which we consider to be 
endowed with the standard Euclidean metric (Sect. 1.4, n. 2). We wish to 
examine some geometrical features of x. If T is the fibre (note 8) of the time 
coordinate function x° over one of its values, x(7) is a hyperplane in R*, 
consisting of all the real number quadruples whose first component is the said 
value of x°. This hyperplane is isometric with R? and is therefore a copy of 
Euclidean space. The set of all such hyperplanes, x) = t, te R, is ordered by the 
parameter t. If s is the collection of all “possible events” in the history of a 
particle o, x(s) is a line that meets each hyperplane m9 = const., once and once 
only. Ifo is motionless in F, x(s) is a straight line orthogonal to the hyperplanes 
No) = const. If o moves uniformly in F with speed v, x(s) is a straight line which 
forms an angle equal to arc tanv with the direction orthogonal to the 
hyperplanes mo = const. Let y be a chart in Ag adapted to a frame G. The 
mapping y': x! is a linear permutation of R*. We call it a Galilei coordinate 
transformation. How does R* fare under such a transformation? A few facts 
deserve our attention. (1) Each hyperplane a) = const. is mapped isometri- 
cally, without reflection, onto a hyperplane parallel to it.!! (2) The transform- 
ation preserves the order of the hyperplanes mz) = t, t€ R, and the distance, in 
R*, between any two of them. (3) The family of parallel lines orthogonal to the 
hyperplanes 7) = const. is mapped onto a family of parallels tilted with respect 
to the former by an angle «, less than 2/2; tana is the speed of F in G. (4) Asa 
consequence of (1) and (3), all parallels are mapped onto parallels: the 
transformation is affine. It is not difficult to see that any permutation of R* that 
satisfies these four conditions is a Galilei coordinate transformation between 
some pair of charts in Ay. Let Gp be the set of all such permutations. Gp 
includes the identity mapping. If 0 belongs to Yp, the inverse permutation 07? 
also belongs to Gp. Moreover, if # and 6’ belong to Gp, their product 0: 0" = 6” 
is affine and satisfies (1) and (2); it also satisfies (3) because, if A is perpendicular 
to the hyperplanes z, = const., the directions of 4 and @”(A) cannot make a 
right angle if @”, which is bijective, satisfies (1), nor an obtuse angle if @” satisfies 
(2) and hence preserves time order. Gp is therefore a group of permutations of 
R*. 

The mapping f: x-y"'+»x7'-y(x,ye€Ag) assigns to each Galilei coor- 
dinate transformation of R* a permutation of .#, which we call a Galilei point 
transformation. Let Gy be the range of f. Since f is bijective and maps the 
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product of two elements of Gg on the product of their images in 4 y, Gy is 
evidently a group of permutations and fis a group isomorphism. Both Gp and 
G g are realizations of the same abstract Lie group, known as the proper 
orthochronous Galilei group. We denote it by Y.!2 Though we have defined it 
by means of its realization as a group of permutations of R*, its structure can 
be conceived apart from all realizations. Though we have used Lange inertial 
frames and Newtonian time coordinates as a historically justified prop for 
explaining the significance of the Galilei group Y, we do not need this 
scaffolding for defining it. This can be better understood if I first introduce the 
concept of the action of a group on a set, which we shall use repeatedly in 
this book. A group # is said to act on a set S if there is a surjective mapping 
F: #xS—S such that, for every g, he #’ and every seS, F(gh,s) 
= F(g, F(h, s)). F is called the action of #on S. It is said to be transitive if for 
every s, s’€S there is an he #, such that s’ = F(h, s). It is said to be effective if 
F(h, s) = s, for every seS, only if h is the neutral element of #. We write 
hereafter hs instead of F (h, s).!3 The mapping s hs (se S), for a fixed he #, 
is clearly a permutation or transformation of S. Such transformations form a 
group, isomorphic with #, the realization of # by its action F. If # is a Lie 
group and S an n-manifold and F is sufficiently smooth (p. 258), we say that # 
acts on S as a Lie group of transformations. Each transformation s++hs is 
evidently a diffeomorphism (p. 258). By means of the concept of group action 
we can express the structure of ¢ in terms of that of other known groups. Since 
groups are structured sets they can act on each other. The semidirect product 
H x KH ofa Lie group # bya Lie group ¥% on which the former acts is the 
product manifold # x # endowed with the group product: (h, k)(h’, k’) 
= (hh, khk’). If O* (3) denotes the multiplicative group of 3 x 3 orthogonal 
matrices with positive determinant, and 7, is the underlying additive group of 
the vector space R", 4 is the semidirect product (O* (3) x T,) x T,—the 
actions involved being constructed by ordinary matrix multiplication. '* 

Let # and S be as above. An n-place relation R on S is said to be invariant 
under #, or #H-invariant if, for every n-tuple (s,,...,5,) of points of S and 
every he W, whenever (s,,... , s,) have the relation R, (hs,,... , hs,)also have 
it. (View properties as “one-place relations”.) This definition is readily 
extended to mappings of S” into any set; e.g., f: S? > S’ is #-invariant if, for 
any s,,8,€S and any heH#, f(s,,5.)=f(hs,,hs,). In the Erlangen 
programme, Felix Klein (1872) successfully employed the concept of invari- 
ance under a group for unifying in an orderly system most of the seemingly 
disparate branches of 19th century geometry. In Klein’s system, a geometry is 
induced—or should we say disclosed?—in a manifold by a group that acts on 
it. The geometric structure is constituted by the group invariants. From this 
point of view, we define the Newtonian structure—in the absence of matter— 
of the manifold .@ diffeomorphic with R*, by the properties and relations of 
A that are invariant under $—or more precisely, under the action of Y on. 4 


NEWTONIAN PRINCIPLES 27 


that realizes Y by means of Gy, the group of Galilei point transformations 
introduced above. A rigorous analytical deduction of the Y-invariants is out of 
the question here, but a quick glance at our earlier description of the action of 
the Galilei group on R* should persuade us that they include the properties, 
relations and functions listed below. (As usual, x stands for an arbitrary chart 
of Ag, and x° for its time coordinate function.) 


A, G-invariant relations between worldpoints 


(i) Two worldpoints P, and P, are simultaneous if and only if 
x°(P,) = x°(P,). (Hence, simultaneity is an equivalence, i.e. a sym- 
metric, reflexive and transitive relation.) 

(li) P, follows P, (and P, precedes P,) if and only if x°(P,) > x°(P,). 

(ili) Three worldpoints P,, P, and P, are collinear if and only if x(P,), 
x(P,) and x(P3) lie on the same straight line in R*. 

(iv) Four worldpoints P,,...,P4 are coplanar if and only if x(P,), 
...,X(P4) lie on the same plane in R*. 


B. G-invariant properties and relations of worldlines 


(i) A worldline A is spacelike if and only if all worldpoints on A are 

simultaneous. 

(ii) A is timelike if and only if no two worldpoints on 4 are simultaneous. 

(ili) Ais straight if and only if any two worldpoints on 4 are collinear with 
every other worldpoint on J. 

(iv) Two straights 4 and pare coplanar if and only if any two worldpoints 
on A are coplanar with any two worldpoints on uy. 

(v) Two coplanar straights A and y are parallel if and only if there does 
not exist a worldpoint P that ts collinear with two points of A and is 
also collinear with two points of y. 


C. G-invariant functions 


(1) The temporal distance t(P,, P,) between two worldpoints P, and P, 
is equal to the distance in R*, between the two hyperplanes 
To = x°(P,) and mo = x°(P2). 

(ii) The spatial distance o(P,,P,) between two simultaneous 
worldpoints P, and P, is equal to the distance in R* between x(P,) 
and x(P3). 


The distance function o evidently satisfies the laws of Euclidean geometry 
on each set of mutually simultaneous worldpoints. It can be used for defining 
the length of spacelike straights (viz., as the spatial distance between their 
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endpoints). On the other hand, since Yp is a group of linear but generally non- 
isometric transformations of R*, there is no G-invariant distance function 
between arbitrary worldpoints.'> A G-invariant angle measure can only be 
defined for concurrent spacelike straights. The distance between two spacelike 
worldlines can be readily defined as in ordinary geometry if their points are 
simultaneous, but it makes little sense to speak of the distance between non- 
spacelike worldlines. However, we can define equidistance between two 
timelike worldlines 4 and y if for for every point P on 4 there is a point Q on 
which is simultaneous with P. If this condition is met, we say that 4 and y are 
equidistant if and only if the spatial distance between each pair of simultaneous 
worldpoints in A Uy is the same. (This constant spatial distance is then the 
distance between 4 and y.) It is not hard to see that two timelike straights 
whose union consists of pairwise simultaneous points are equidistant if and 
only if they are parallel. 

Newtonian spacetime, as characterized above, is not a metric space. But it is 
what we shall call an affine space (p. 91), and its system of straights agrees 
with that of four-dimensional Euclidean space. We have characterized the 4- 
invariants of Newtonian spacetime by means of an arbitrary Galilei chart, but 
they can also be given an intrinsic or “coordinate free” definition.'® The 
foregoing characterization can then be used for defining a Galilei chart. 

The geometry of empty Newtonian spacetime throws much light on the 
conceptual framework developed in the earlier sections of this chapter. A 
congruence of curves ina manifold S is a set of curves in S such that each point 
of S lies on the range of one and only one curve of the set. A frame in 
Newtonian spacetime can be defined as a congruence of timelike curves. A 
frame F is rigid if each pair of curves in F are equidistant. A rigid frame is 
inertial if it isa congruence of straights. We see at once how each rigid frame is 
associated with a copy of Euclidean 3-space. A congruence of curves in a 
manifold is a partition of the manifold and thereby defines a relation of 
equivalence between the points of the latter. Let f be the equivalence defined 
on .@ by the rigid frame F. (Two worldpoints are equivalent by fif and only if 
they lie on the same curve of the congruence F.) Sy, the relative space of F, is 
none other than the quotient manifold .“/f, whose points are the curves of F. 
Since the distance between two of these curves is the spatial distance between 
any two simultaneous points of them, the quotient space .“/f is a copy of 
Euclidean space. There is no longer any mystery about the “mutual penet- 
ration” of the relative spaces of two rigid frames, if we conceive them as 
different quotient manifolds of the same spacetime .@ defined by different 
partitions of its worldpoints. Let s denote simultaneity; s is an equivalence. The 
quotient manifold #/s is diffeomorphic with R. We take .#/s to be what we 
mean by Newtonian time. We make the latter, elusive concept, perspicuous by 
identifying it with the former, which is clear and distinct. It is immediately clear 
in what sense Newtonian time is universal: namely, because each instant ts an 
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equivalence class of simultaneous “possible events” at all points of the relative 
space of any frame. And why it is absolute: namely, because the partition of 
worldpoints into “simultaneity classes” does not depend on the choice of a 
frame. 

The connection between Galilei charts and inertial frames can now be 
described as follows: If x is a Galilei chart, the parametric lines of the time 
coordinate function x° (i.e. the curves in .@ along which x° varies while the 
space coordinate functions x* take constant values) constitute a congruence of 
timelike straights, or, in other words, an inertial frame. Call it X. Let 
p: (a,b,c, d)++(b,c,d) be the projection of R* onto R? defined by “ignoring” 
the first element of each real number quadruple, and let x stand for p: x. The 
fibre of X over a given real number triple is then a parametric line of x°, that is,a 
point of the relative space of the inertial frame X. X is evidently a Cartesian 
coordinate system for X. If y is another Galilei chart, related to x by equations 
(1.6.3) with v, = v, = v, = 0 the parametric lines of y° constitute the same 
inertial frame X. Let us say that two Galilei charts are frame equivalent if they 
thus define the same inertial frame. If x and y are frame inequivalent charts of 
Ag one can always find a chart z, frame equivalent to y and related to x by 
equations (1.6.2), with (a,g) = (64g), the 3 x 3 identity matrix. We call x: za 
purely kinematic(pk) Galilei coordinate transformation. The transformation 
depends only on the three parametres v,, which are the velocity components of 
the frame defined by y and z, in the frame defined by x, along the three 
directions determined by the parametric lines of the space coordinate 
functions x*. The pk transformations constitute a subgroup of Y. For if x, y 
and z are frame inequivalent charts of Ag, such that y-x~! and z-y~' are pk 
Galilet transformations governed by 


yo = x? 2° = y? 


. 3 as ‘ (1.6.4) 
yt = x%+0,x z* = yi +wy®, 


it follows that z-x~! is also a pk Galilei transformation governed by 


gS 


1.6.5 
z* = x*4 (v, +w,)x°. ( ) 


Writing r for the number triple (r,, r,,r3) and x, for the coordinate velocity 
dx*/dx°, it is clear that a particle at rest in the frame defined by x moves with 
velocity y = v in the frame defined by y and with velocity z = v + w in the 
frame defined by z. This is the Galilei Rule for the Transformation of Velocities. 

A material particle is represented in Newtonian spacetime by the set of all 
“possible events” in its history, that is by a timelike worldline. A free particle 
has a straight worldline: this is the geometric contents of Newton’s First Law. 
That of the Second Law is the following: Under the influence of an impressed 
force the worldline of a particle deviates from its direction at each worldpoint in 
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the spacelike direction in which the force is acting, at a time rate directly 
proportional to that force and inversely proportional to the particle's mass. The 
difference between a free or inertial and a non-inertial or forced motion is 
therefore analogous to that between straightness and crookedness—an 
analogy that Newton himself suggested in a note of his “Waste Book”.'” As we 
saw on page 18, the Third Law can teach us, in principle, what particles are free, 
and hence, what timelike worldlines are straight. It will be noticed that the 
Newtonian concept of an impressed force, endowed with a direction in space, 
depends essentially on the existence of an absolute time, that unambiguously 
determines the simultaneity class of any given worldpoint, and hence the set of 
spacelike directions at the latter. 

The Newtonian Principle of Relativity, which proclaims the dynamical 
equivalence of all inertial frames, is so to speak built into the structure of 
Newtonian space-time. Its implications for Newton’s programme of natural 
philosophy and its physical significance become clearer in this perspective. 
Consider an isolated system of mutually interacting material particles. The 
history of the system between two instants of time is represented in Newtonian 
spacetime by a configuration S of timelike worldlines, one for each particle in 
the system, whose directions are variously changing under the influence of the 
forces with which the particles act on each other. According to Newton’s 
programme (Section 1.1), the laws that govern the operation and evolution of 
natural forces should be ascertainable in the light of the phenomena of motion, 
that is, by studying such configurations as S. If t is a Galilei point 
transformation, Newtonian relativity implies that tS, the image of S by t, is 
dynamically indistinguishable from S. S and tS are therefore equally good 
representations of our isolated system, or of physically equivalent isolated 
systems, and the study of S or of tS must lead to the discovery of exactly the 
same laws. Consequently, Newtonian relativity demands that the operation 
and evolution of the forces of nature depend only on the Y-invariant features 
of the spacetime representations of material systems and on intrinsic properties 
of material objects—such as mass, electric charge, etc—which must be deemed 
to be Y-invariant as well. Consider now the image x(S) of S by a Galilei chart x. 
x(S) provides a full description of the shape of S by means of real number 
quadruples and their relations. Another Galilei chart y yields a different 
description y(S). But the laws disclosed by the study of S must hold equally 
well for either description. Suppose that the shape of S is determined by a set of 
dynamical laws L together with initial and boundary conditions C(S). If C,,(S) 
denotes the description of the latter in terms of the chart x, there must be a set 
of equations L,, involving the coordinate functions of x and the functions 
representing in R* the spacetime distribution of the relevant intrinsic 
properties of matter, such that the shape of x(S) can be derived from L, 
together with C,(S). Obviously, the equations L, mus be regarded as the 
expression of LZ in terms of x. If the functions involved in L, are subjected to 
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the Galilei coordinate transformation y- x~ 1, one should obtain another set of 
equations L, which, together with the conditions C,(S), will determine the shape 
of y(S). Moreover, since the laws L must only concern Y-invariant properties 
and relations, their expressions L, and L, in terms of the charts x and y cannot 
involve traits peculiar to the representation of S by either chart. Hence, one 
concludes, L, and L, must be equations of the same “form”, that can be 
obtained from one another by substitution of variables. These abstract and 
somewhat imprecise considerations will be illustrated by an important 
example in Section 1.7. But before passing to it I wish to state one more 
consequence of the foregoing, which epitomizes the physical meaning of 
Newtonian relativity. Take again the physical process we have represented by 
the spacetime configuration S, determined by laws L and conditions C(S). The 
Galilei point transformation t maps S onto 1S, determined by L and 
conditions C(tS). If t = y~!-x, where y and x are suitable Galilei charts, x(S) 
and y(tS) provide identical numerical descriptions of S and of 7S, respectively. 
Evidently, the conditions C,(S), from which, together with L,, one derives 
x(S), are the same set of numbers and numerical relations as the conditions 
C,(tS), from which, together with L,, one derives y(tS). Consequently, if the 
laws of nature satisfy the requirements of Newtonian relativity, two physical 
processes, whose determining conditions can be described in the same way in 
terms of two Galilei charts (of our atlas Ag), will receive, through their entire 
development, the same description in terms of those two charts. As Galilei 
himself expressed it, with unrivalled clarity: 


Shut yourself up with some friend in the main cabin below decks on some large ship, and have 
with you there some flies, butterflies, and other small flying animals. Have a large bowl of 
water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel 
beneath it. With the ship standing still, observe carefully how the little animals fly with equal 
speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into 
the vessel beneath; and, in throwing something to your friend, you need throw it no more 
strongly in one direction than another, the distances being equal; jumping with your feet 
together, you pass equal spaces in every direction. When you have observed all these things 
carefully (though there is no doubt that when the ship is standing still everything must 
happen in this way), have the ship proceed with any speed you like, so long as the motion is 
uniform and not fluctuating this way and that. You will discover not the least change in all 
the effects named, nor could you tell from any of them whether the ship was moving or 
standing still.'® 


1.7 Gravitation 


Newton’s greatest contribution to the fulfilment of his philosophical 
programme was the law of universal gravitation. With it he unified, boldly and 
successfully, two of the most conspicuous regularities of nature, free fall and 
planetary motion. Although we no longer consider Newton’s formula as 
perfectly accurate and we reject the notion of force that underlies it, today 
such phenomena can be only conceived of, regardless of their apparent 
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disparateness, as instances of the same general law. (Philosophers who tend to 
underrate the permanence of scientific achievements would do well to reflect 
on this fact.) 

According to Newton’s Law of Gravity, a particle of mass m,, at a distance r 
from a particle of mass m,, suffers the action of an impressed force directed 
toward the second particle and proportional to m, m,/r?. By the Third Law of 
Motion, the second particle must then be acted on by a force of the same 
magnitude and opposite direction. It is clear therefore that the susceptibility of 
a particle to gravitational force—what we may call its “passive gravitational 
mass”—must be proportional to its power of exerting such a force—its “active 
gravitational mass”. On the other hand, it is a remarkable coincidence that 
these properties of matter should be measured by the same quantity that 
determines matter’s “innate force” or resistance to acceleration—what we have 
hitherto called, with Newton, its “mass” and shall henceforth call its “inertial 
mass”. Newton was well aware that the proportionality between inertial mass 
and passive gravitational mass, or, as he puts it, between the “quantity of 
matter” and its “weight”, was not a conceptual necessity. He claimed to have 
established it experimentally, with an accuracy of 1 part in 1000, by means of 
pendulums “perfectly equal in weight and figure”, filled with equal weights of 
“gold, silver, lead, glass, sand, common salt, wood, water, and wheat”. 

Let y be a Galilei chart, and let y#(t) be the space coordinates of a particle j, of 
mass m,, at time ¢. In terms of y, the force impressed on j at time t can be 
represented by a vector in R®*, with components F*(t) = m,d? y3(t)/(dy°)’. If 
we use bold face, as on page 29, to denote vectors in R3, and a dot or two over a 
function to designate its first or second time derivative, the gravitational force 
on particle 1, of mass m,, in the presence of particle 2, of mass m,, is given by 
either side of the equation: 

mm 
ly: —yel* 
where G, the gravitational constant, is approximately equal to 
6.67:10°'' m3 kg~!s~?. Let x be another Galilei chart related to y by 
equations (1.6.3). Let A denote the matrix of the parameters a,,. Since A is 
orthogonal, | Ar| = |r|, for any vector r in R*. Substituting x,, x, and x, in 
(1.7.1), in accordance with (1.6.3), we have that 


my, = —G (¥; —Y2), (1.7.1) 


mm, 


m,AX, = —G A(x, —X2). (1.7.2) 


|x; —x2|° 
Multiplying both sides on the left by 4~', we obtain: 
(x, —X). (1.7.3) 


We have here an example illustrating the general ideas presented on page 30. 
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The Law of Gravity, expressed in terms of the chart y by equations (1.7.1), is 
expressed in terms of the chart x by equations (1.7.3). The latter are obtained 
from the former via the Galilei coordinate transformation between x and y, or 
directly by substitution.” Dividing both members of either set of equations by 
m,, we find that the gravitational acceleration of a particle does not depend on 
its mass. This fact, noted in the case of falling bodies by Galilei, is thus 
explained by Newton as a consequence of the proportionality of inertial and 
gravitational mass. 

Newton’s Law of Gravity was ill-received by many of his contemporaries 
because it postulates instantaneous interaction between distant bodies. As the 
Earth revolves around the Sun, the opposite and equal forces which they exert 
on each other adjust continually and without delay to the varying distance 
between them. In the eyes of Huygens or Leibniz, to allow direct action at a 
distance was to return to the pre-Cartesian natural philosophy of occult, magic 
powers. Newton himself rejected the view of gravity as “innate, inherent, and 
essential to matter”. He wrote to Bentley: 


That one body may act upon another at a distance through a vacuum, without the mediation 
of anything else, by and through which their action and force may be conveyed from one to 
another, is to me so great an absurdity that I believe no man who has in philosophical 
matters a competent faculty of thinking can ever fall into it.? 


Gravitational interaction requires therefore a cause besides the mere presence 
of the interacting bodies. But Newton would frame no hypotheses concerning 
it. To him it was enough that “gravity does really exist, and act according to the 
laws” he had stated.* However, during the 18th century, the overwhelming 
success of Newton’s theory and the growing awareness that action by contact 
is not a whit more intelligible than action at a distance, led scientists and 
philosophers to regard Newtonian gravity as a paradigm of interaction 
between bodies, and to explain even impenetrability, pressure and collisions by 
central repulsive—instead of attractive—forces, acting from afar.* 

New mathematical techniques originating also in the 18th century made it 
possible to reformulate Newton’s Theory of Gravitation as a field theory. A 
given distribution of matter determines a scalar field, ie. a Galilei invariant 
real-valued function, the gravitational potential, on all space. The potential ® 
determined by a matter distribution of density p can be calculated by 
integrating Poisson’s equation: 


V?® = 4nkp (1.7.4) 
At each point the field exerts on matter a force per unit volume given by: 
f= —pV® (1.7.5) 


Equations (1.7.4) and (1.7.5) are the field-theoretic version of Newton’s Law of 
Gravity. Potential theory is a powerful tool for working out the effects of 


34 RELATIVITY AND GEOMETRY 


gravity in actual physical situations, but it does not replace the conceptual 
foundation laid down by Newton. Poisson’s equation can be derived from 
Newton’s Law by straightforward mathematical reasoning.® And the gravi- 
tational potential at any time depends on the distribution of matter at that time 
and will instantaneously change throughout space when that distribution 
changes. 

The idea of Newtonian spacetime was originally developed as still another 
reformulation of Newton’s Theory of Gravitation. The version of that idea 
given in Section 1.6 holds only in the limit in which spacetime is practically 
void. But Newtonian spacetime in the presence of matter cannot be simply 
explained in terms of group-invariants. It is a manifold of variable curvature 
and, as such, it lies beyond the scope of Klein’s Erlangen programme. We shall 
refer to it briefly on page 191, in connection with Einstein’s theory of gravity. 


CHAPTER 2 


Electrodynamics and the Aether 


2.1 Nineteenth-Century Views on Electromagnetic Action 


In 1785 Coulomb verified by means of the torsion balance the law of elec- 
tric interaction that was proposed by Priestley in 1767. Charged bodies 
repulse each other—if charged with electricity of the same kind—or attract 
each other—if charged with electricity of different kinds—with a force directly 
proportional to their charges and inversely proportional to their distance 
squared. Coulomb verified also a similar law of magnetic interaction, first 
published in 1750 by Michell.! Naturally enough, Coulomb’s Laws—as they 
are usually called—were greeted as a splendid corroboration of the Newtonian 
paradigm of action at a distance by central forces. Volta’s invention in 1800 of 
a device capable of generating steady currents opened the way to the 
tremendous surge of electrical science in the 19th century. In 1820 Oersted 
discovered the action of a conducting wire on a magnetic needle. Thereupon 
Ampére conjectured that magnets are but congeries of tiny electric circuits, and 
that conducting wires must attract or repel each other. Combining clever and 
painstaking experimentation with brilliant reasoning he developed his 
“mathematical theory of electrodynamical phenomena inferred from ex- 
perience alone”.? Ampére analysed electric circuits into infinitesimal elements 
of current characterized by their current intensity and their spatial direction. 
Any two such elements interact, by Ampére’s Law, with a force directed along 
the line joining them and proportional to the product of their current 
intensities multiplied by a function of their directions and divided by their 
distance squared. Thus Ampére fulfilled his announced aim of reducing the 
newly discovered electrodynamic effects “to forces acting between two 
material particles along the straight line between them and such that the action 
of one upon the other is equal and opposite to that which the latter has upon 
the former”.> His compliance with the Newtonian paradigm must be taken 
with a grain of salt, however, for his centres of force are in fact line elements, 
endowed with both position and direction. 

Ampére’s theory predicted Oersted’s effect and the mutual attraction and 
repulsion between conducting wires that Ampére himself had conjectured and 
confirmed, but it did not predict the phenomena of electromagnetic induction 
discovered by Faraday in 1831. In 1847 Wilhelm Weber derived Faraday’s Law 
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of Induction and Ampére’s Law of Currents from the twofold assumption that 
electric currents are caused by the symmetric motion in opposite directions of 
particles charged with equal amounts of positive and negative electricity, and 
that Coulomb’s Law of electric interaction must be supplemented with two 
additional terms depending on the relative radial velocities and accelerations 
of the charged bodies. According to Weber’s Law, a particle with charge e acts 
on a particle with charge e’, at a distance r, with a force directed along the line 
joining both particles, of magnitude 


ee’ a 7 
(tae A. 
(1 53+) (2.1.1) 


Here c is a constant, of the dimensions of a velocity, equal to the ratio of the 
electromagnetic to the electrostatic unit of charge.* Weber’s electrodynamics 
found much acceptance in continental Europe, in spite of persistent criticism 
by Helmholtz. In his great Treatise of 1873, Maxwell presents it as the most 
interesting version of a way of conceiving electromagnetic phenomena—based 
on “direct action at a distance”—to which his own—based on “action through 
a medium from one portion to the contiguous portion”—is “completely 
opposed”.® Weber’s approach withered away when Maxwell’s won general 
approval, after Hertz in the late 1880’s experimentally established that 
electromagnetic actions propagate with a finite speed.’ 

Maxwell avowedly drew his inspiration from Faraday, who conceived the 
observable behaviour of electrified and magnetized bodies as the manifes- 
tation of processes occurring in the surrounding space. Faraday regarded 
charged particles and magnetic poles as the endpoints of “lines of force”, that 
fill all space, and can be made visible—between magnetic poles—by means of 
iron filings. Though Faraday’s views were scorned by the scientific establishment 
of the time because he was unable to formulate them in the standard language 
of mathematical analysis, “his method of conceiving the phenomena—as 
Maxwell rightly observed—was also a mathematical one, though not exhibited 
in the conventional form of mathematical symbols”.® If we define the electric 
intensity at a point of space at a given time as the force that would act at that 
time on a test particle of unit charge placed at that point, there is evidently 
defined, at each instant of Newtonian time, a vector field on all space, viz. the 
mapping assigning to each point of space the electric intensity vector at that 
point. An electric line of force is simply an integral curve of this vector field, 
that is, a curve tangent at each point to the electric intensity vector at that 
point.’ A magnetic line of force can be defined analogously. The chief problem 
of electrodynamics is then to find the laws governing the evolution of the 
electric and magnetic vector fields through time. Maxwell’s “general equations 
of the electromagnetic field” provide the classical answer to this problem.!° We 
have learned to read them as statements about geometric objects in space, and 
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it is worth noting that this reading agrees well with Faraday’s conceptions. But 
Maxwell himself understood them differently. The mechanical action ob- 
served between electrified or magnetized bodies involves the transmission of 
energy. If, as Maxwell assumed, after Faraday, this transmission is not 
instantaneous, the energy—he believed—must be stored up for some time ina 
materia! substance filling the “so-called vacuum” between the bodies. Maxwell 
regarded his general equations as a phenomenological theory of the behaviour 
of this substance. 

Maxwell’s view was all but radical. The existence of a pervasive material 
medium capable of storing energy on its way from one ordinary body to 
another was a familiar assumption of contemporary optics, which successfully 
explained the phenomena of light as the effect of transversal waves in that 
medium. The medium was called the aether—Descartes’ name for im- 
ponderable matter—and there was no dearth of hypotheses about its structure 
and dynamics. Maxwell had one very powerful reason for thinking that the 
opticians’ aether was also the carrier of electromagnetic action. His theory 
predicted the existence of transversal wavelike disturbances of electromagnetic 
origin, propagating in vacuo with a constant speed equal to the ratio c between 
the electromagnetic and the electrostatic units of charge (see equation 2.1.1 
and note 4). This ratio had been measured by Weber and Kohlrausch, and 
Maxwell observed that it differed from the known value of the speed of light in 
vacuo by less than the allowable experimental error.'! He concluded that light 
consisted of the electromagnetic waves predicted by his theory. 

In the introductory part of “A Dynamical Theory of the Electromagnetic 
Field”, read before the Royal Society in December 1864, Maxwell summarized 
his electrical philosophy. He said then: 


It appears therefore that certain phenomena in electricity and magnetism lead to the same 
conclusion as those of optics, namely, that there is an aethereal medium pervading all bodies, 
and modified only in degree by their presence; that the parts of this medium are capable of 
being set in motion by electric currents and magnets; that this motion is communicated from 
one part of the medium to another by forces arising from the connexions of those parts; that 
under the action of these forces there is a certain yielding depending on the elasticity of these 
connexions; and that therefore energy in two different forms may exist in the medium, the 
one form being the actual energy of motion of its parts, and the other being the potential 
energy stored up in the connexions, in virtue of their elasticity. 

Thus, then, we are led to the conception of a complicated mechanism capable of a vast 
variety of motion, but at the same time so connected that the motion of one part depends, 
according to definite relations, on the motion of other parts, these motions being 
communicated by forces arising from the relative displacement of the connected parts, in 
virtue of their elasticity. Such a mechanism must be subject to the general laws of Dynamics.'* 


In an earlier paper! * he had “attempted to describe a particular kind of motion 
and a particular kind of strain, so arranged as to account for the pheno- 
mena”.!* But he never repeated such an attempt. He continued to believe that 
“a complete dynamical theory of electricity” must regard “electrical action 
[. . .] as the result of known motions of known portions of matter, in which 
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not only the total effects and final results, but the whole intermediate 
mechanism and detail of the motion, are taken as the objects of study”.!*> But 
he obviated the need for such a complete theory by resorting to the dynamical 
methods of Lagrange, “which do not require a knowledge of the mechanism of 
the system”.'© He thus dodged the difficulty that beset all hypotheses 
concerning the mechanical workings of the aether: the aether must be infinitely 
rigid, for all disturbances are propagated across it by transversal, never by 
longitudinal waves; and yet it does not appear to offer the slightest resistance to 
the motion of the bodies immersed in it. While Maxwell’s friend and mentor, 
William Thomson, Lord Kelvin, devised one unsatisfactory mechanical model 
of the aether after another,'’ the next generation of electricians was ready to 
accept the aether governed by Maxwell’s Laws as a peculiar mode of physical 
existence, different from and perhaps more primitive and fundamental than 
the one exemplified by ponderable matter. In a letter to Hertz of September 
13th, 1889, Heaviside wrote: 


It often occurs to me that we may be all wrong in thinking of the ether as a kind of matter 
(elastic solid for instance) accounting for its properties by those of the matter in bulk with 
which we are acquainted; and that the true way, could we only see how to do it, is to explain 
matter in terms of the ether, going from the simpler to the more complex.'® 


2.2 The Relative Motion of the Earth and the Aether 


It is often said that Newtonian physics acknowledged the equivalence of 
inertial frames only with regard to dynamical laws and experiments, and that 
Einstein’s peculiar achievement was to declare them equivalent with respect to 
physical laws and experiments of every conceivable kind. This statement, which 
introduces Einstein as the young genius who boldly and successfully extended 
a principle that his predecessors had niggardly restricted, is useful in the 
classroom, where one must try to win over to Einstein’s theory the un- 
sympathetic minds of adolescents, steeped in common sense; but it is not fair to 
Newton or to his 19th century followers. The distinction between dynamical 
and non-dynamical physical laws and experiments is completely out of place in 
Newtonian science, which, as we saw in Chapter 1, sought to understand all 
natural phenomena as effected by the forces and governed by the laws 
disclosed by the phenomena of motion. And during most of the 19th century 
this Newtonian programme was faithfully adhered to by the physical 
establishment—excepting Michael Faraday, who for this very reason was 
generally spurned as a theoretician and acknowledged only as a great 
experimentalist. Of course, if we agree to restrict dynamics, by a wanton 
though viable semantic convention, to the direct interaction of ordinary, 
tangible matter, we must conclude that optics, according to the Young—Fresnel 
wave theory of light, and electromagnetism, as conceived by Maxwell and his 
school, lie beyond its scope. But even then we cannot conclude that they also lie 
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beyond the scope of the Newtonian Principle of Relativity. The gist of this 
principle is that any two experiments will have the same outcome with regard, 
respectively, to two inertial frames, if all the physical objects relevant to either 
experiment are initially in the same state of motion relative to the correspond- 
ing frame. Now according to late 19th-century physics the principle will 
certainly apply to optical and electromagnetic experiments provided that the 
aether is included among the physical objects whose state of relative motion is 
taken into account.! The proviso is sensible enough. After all, in a “purely 
dynamical” experiment such as dragging a marlin from a boat nobody would 
expect the tension on the fishing-line to be the same regardless of the currents 
in the water. 

Maxwell’s equations were expressed in terms of a Cartesian coordinate 
system or in the now familiar compact vectorial notation. In either form they 
are referred to a definite rigid frame. Are there any restrictions on the choice of 
the latter? Maxwell does not seem to have paid much attention to this question, 
but there are signs that he thought that almost any frame one could reasonably 
care to choose would do, if one was willing to ignore—as indeed one must— 
experimentally indiscernible differences. In Maxwell’s theory, the outcome of 
electromagnetic experiments depends on the behaviour of (i) a system of 
ponderable bodies—-conductors, magnets, dielectrics—and (ii) the aether. In 
the Treatise, §§600f., Maxwell argues that “the electromotive intensity is 
expressed by a formula of the same type”, whether we refer it to “a fixed system 
of axes”—that is, I presume, one relatively to which the fixed stars are at rest— 
or to a system of axes moving with uniform linear and uniform angular 
velocity with respect to the former. The only change that must be introduced in 
the said formula as a consequence of such a transformation of coordinates 
does not alter the value of the electromotive force round a closed circuit. 
Therefore— Maxwell concludes—“in all phenomena relating to closed circuits 
and the currents in them, it is indifferent whether the axes to which we refer the 
system be at rest or in motion”.? So much for the motion of conductors. That 
of magnets and dielectrics is not explicitly discussed in the Treatise. As to the 
aether, since the speed c with which electromagnetic action propagates 
through it is involved in Maxwell’s equations, the equations obviously cannot 
take the same form with respect to any frame, whether the aether is at rest or in 
motion in it. It would seem, however, that Maxwell believed that any changes 
in their form arising from this consideration could be safely neglected, at least 
for the time being, because the differences due to the motion of the aether 
relative to such frames of reference as our terrestrial physicists can be expected 
to choose were still beyond the reach of experimental detection.? 

The appropriate frame of reference for the electromagnetic field equations, 
or rather, the appropriate form of the latter when referred to one frame or 
another, was explicitly considered by Heinrich Hertz. His first exposition of 
Maxwell’s theory, which had a tremendous impact on the young physicists of 
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continental Europe, dealt with “the fundamental equations of electromag- 
netics for bodies at rest”.* It was followed, in the same year, by a second paper, 
“On the fundamental equations of electromagnetics for bodies in motion”.° 
Hertz remarks here “that whenever in ordinary speech we speak of bodies in 
motion, we have in mind the motion of ponderable matter alone”. However, 
according to his view of electromagnetic phenomena “the disturbances of the 
ether, which simultaneously arise, cannot be without effect; and of these we 
have no knowledge”. The question with which the paper is concerned cannot 
therefore “at present be treated at all without introducing arbitrary assump- 
tions as to the motion of the ether”.® The fact that we cannot remove the aether 
from any closed space would indicate indeed that “even in the interior of 
tangible matter the ether moves independently of it”. According to Hertz this 
would imply that the electromagnetic conditions of the aether and of the 
ponderable matter at every point of space must be described separately, by 
means of two distinct pairs of vectors. This complication can be avoided, 
however, 


if we explicitly content ourselves with representing electromagnetic phenomena in a 
narrower sense—up to the extent to which they have hitherto been satisfactorily 
investigated. We may assert that among the phenomena so embraced there is not one which 
requires the admission of a motion of the ether independently of ponderable matter within 
the latter; this follows at once from the fact that from this class of phenomena no hint has 
been obtained as to the magnitude of the relative displacement. At least this class of electric 
and magnetic phenomena must be compatible with the view that no such displacement 
occurs, but that the ether which is hypothetically assumed to exist in the interior of 
ponderable matter only moves with it.[ .. . ] For the purpose of the present paper we adopt 
this view. 


Having thus temporarily assumed that the aether inside a ponderable body is 
completely dragged along with the latter, Hertz goes on to propose a set of 
electromagnetic field equations applicable to bodies in motion in the frame to 
which the equations are referred. These can be obtained more or less 
straightforwardly from the Maxwell equations, as given in Hertz’s former 
paper, using the Galilei Rule for the Transformation of Velocities (p. 29).° But 
Hertz’s equations for bodies in motion clash with the outcome of some 
electromagnetic experiments (which, for the most part, had not yet been 
performed in 1890)°, and also with some familiar facts of optics, to which we 
now turn. 

In 1728 James Bradley announced “a new discovered Motion of the Fixt 
Stars”: annually, the apparent position of any star describes a small ellipse in 
the sky. Bradley had been trying to observe the stellar parallax, that is, the 
variation of the apparent position of a star caused by changes in the actual 
position of the earth. But the displacement he discovered could not be that 
which he sought for, because its amount varied precisely with the latitude of 
the stars observed, while the parallax depends on their presumably random 
distances to the earth; and also because each star was seen to move at any given 
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time perpendicularly to the direction in which the earth was moving, while the 
motion associated with parallax must be parallel and opposite to the motion of 
the earth. The corpuscular theory of light, then in fashion, furnished Bradley 
with a simple explanation of this phenomenon, subsequently known as the 
aberration of stars, or the aberration of light. According to that theory, 
starlight pours on us like rain, in straight lines parallel to the direction in which 
its source actually lies; however, in order to capture it with our telescope on the 
moving earth, we must tilt the instrument in the direction of motion, as we tilt 
our umbrella when we run under the rain. More precisely: if light falls 
vertically on the objective of a telescope moving with speed v relative to the 
fixed stars, in order that a light corpuscle may reach the eyepiece, the telescope 
must be tilted by a small “angle of aberration” «, such that tana = v/c— 
provided, of course, that the corpuscle travels inside the telescope with the 
same vertical velocity c with which it travels in vacuo.'° This result, though in 
excellent agreement with observation, appeared to clash with the wave theory 
of light, which in many ways could claim superiority over the corpuscular 
theory after 1800. For according to the wave theory the angle of aberration will 
not depend on the velocity of the telescope relative to the source of light, but on 
its velocity relative to the medium of light propagation, and the latter, as G. G. 
Stokes remarked, can agree with the former only on “the rather startling 
hypothesis that the luminiferous ether passes freely through the sides of the 
telescope and through the earth itself”.!! Stokes, however, by a subtle 
mathematical argument, explained aberration in accordance with the wave 
theory on the seemingly natural assumption that the aether is completely 
carried along by the earth and the bodies on the earth’s surface, while its 
velocity changes as we ascend through the atmosphere, “till, at no great 
distance, it is at rest in space”.!? Almost thirty years earlier Fresnel (1818) had 
given a different wave-theoretical explanation of aberration, which, though 
based on a highly artificial hypothesis, achieved remarkable experimental 
success. Fresnel’s work was prompted by Arago’s researches on the refraction 
of starlight by prisms (e.g. telescope lenses). Arago’s measurements, carried out 
in 1810, while he still adhered to the corpuscular theory, had shown that 
aberration is not affected by refraction, in spite of the fact that light rays are 
bent by it to compensate for the change of the speed of light from its value c, in 
free space, to c/n in the refractive medium (where n is the medium’s index of 
refraction). Arago (1853) concluded that light corpuscles are emitted by stars 
with different speeds, and that only those whose speed lies within certain 
narrow limits can be perceived by the eye. But Fresnel thought these 
conclusions unlikely. He derived Arago’s results from the wave theory, on the 
hypothesis that the aether is partially dragged by refractive bodies. He 
postulated that the ratio of the aether’s density in free space to its density ina 
medium of refractive index n is 1/n?. He assumed moreover that a refractive 
body in motion drags along with it only the aether it stores in excess of that 
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contained in the same volume of free space. He then showed that, on these 
assumptions, the velocity of light in a medium of refractive index n, moving 
with velocity v, was equal, in the direction of the motion, to c/n + u(1 —1/n’); 
c/n being, as we know, the velocity of light in the same medium at rest. Fresnel’s 
theory accounted for the then known facts of aberration and also implied that 
the angle of aberration would not be affected by using a telescope filled with 
water, a prediction verified by Airy (1871).!° The theory achieved a major 
success when Fizeau (1851) measured the velocity of light in running water and 
found it to be in good agreement with Fresnel’s formula. 

In 1881 A. A. Michelson thought that he had confirmed Stokes’ theory by his 
first interferometer experiment, which failed to disclose a “relative motion of 
the earth and the luminiferous ether”.'* But H. A. Lorentz (1886) showed that 
Michelson had overrated the effect that such a motion would have on the 
outcome of the experiment; when corrected, it turned out to lie within the 
range of experimental error. Lorentz also argued that Stokes’ theory is 
untenable, because it has to postulate that the velocity field of the aether has 
zero curl and is therefore the gradient ofa potential, which is inconsistent with 
Stokes’ assumption that the velocity of the aether is everywhere the same on 
the earth’s surface. In 1886 Michelson and Morley repeated Fizeau’s 
experiment with increased accuracy, providing for the first time a direct 
quantitative confirmation of Fresnel’s “drag coefficient” (1 — 1/n?). So when in 
1887, using an improved interferometer, no longer liable to Lorentz’s 
strictures, Michelson and Morley again found no traces of the relative motion 
of the earth and the aether, they did not claim to have vindicated Stokes, but 
only expressed their perplexity, for their result was in open conflict with 
Fresnel’s hypothesis, which they had so splendidly corroborated a year 
earlier.'® 

Though Hertz’s electrodynamics of bodies in motion agreed, on the surface 
of the earth, with Stokes’ theory and Michelson and Morley’s experiment of 
1887, it was irreconcilable with Fresnel’s drag coefficient and Michelson and 
Morley’s result of 1886. Since it also clashed with some new electromagnetic 
experiments (see note 9) and its author died at 36 in 1894, it was generally 
abandoned. In the last decade of the 19th century the most influential 
contributions to the subject were those of H. A. Lorentz, who, deeply 
impressed by the necessity of explaining the drag coefficient in a less ad hoc way 
than Fresnel’s, conceived matter as consisting of charged particles swimming 
in a stagnant aether that remains unruffled by their motions. In the rest of this 
Section I shall sketch some ideas of Lorentz that are generally regarded as the 
immediate background of Einstein’s “Electrodynamics of Moving Bodies”.!® 

Matter, which Lorentz described as “everything that can be the seat of 
currents or displacements of electricity, and of electromagnetic motions”, 
comprises “the aether as well as ponderable matter”.!’ The latter is “perfectly 
permeable to the aether” and can move “without communicating the slightest 
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motion to the latter”.!® Lorentz postulates that all ponderable bodies contain 
a multitude of positively or negatively charged particles—which he called 
“ions” or “electrons’—and that electrical phenomena are caused by the 
displacement of these particles. The state of the aether is characterized at each 
point by the local values of two vector fields, the dielectric displacement d and 
the magnetic force h. The position and motion of the electrons governs the 
fields according to the equations: 


1 oh 
(2.2.1) 
V-h=0 Vxh=4n( ov) 


where p is the local charge density and v is the local velocity of matter.!? At 
each point the fields exert on matter a force per unit volume given by: 


f =p(4nc7d+v xh) (2.2.2) 


Since the aether is not subject to ponderomotive forces and the electrons do 
not act instantaneously on one another, Lorentz theory is incompatible with 
the Third Law of Motion. Lorentz did not shy away from this conclusion, but 
remarked that, though the Third Law is manifestly satisfied by the observable 
phenomena, it need not apply to the elementary interactions that cause 
them.?° Lorentz (1892a) derived the “Lorentz force” (2.2.2) from the 
“Maxwell—Lorentz” equations (2.2.1) by a variational argument. By integrat- 
ing over “physically infinitesimal” regions he also obtained from (2.2.1) 
Maxwell’s phenomenological equations for ponderable bodies, relating the 
four macroscopic fields of electric force E, electric displacement D, magnetic 
force H and magnetic induction B. In the same paper Lorentz derived Fresnel’s 
drag coefficient from his basic equations.?! 

Lorentz naturally assumed that equations (2.2.1) and (2.2.2) are referred to 
the frame, presumed inertial, in which the aether is permanently at rest. By 
subjecting them to a Galilei coordinate transformation one can see at once that 
this is the only frame in which the equations are valid as they stand. However, 
Lorentz did not identify, as some have said, the aether’s rest frame with 
Newton’s absolute space. He was well aware that to speak of “the absolute rest 
of the aether [. . .] would not even make sense”.*? 

In the course of his investigations Lorentz (1892a) constructs some auxiliary 
functions that enable him to calculate the state of the aether from data referred 
to a material system moving in the aether frame with constant velocity v. The 
differential equation for one of the said functions can be easily solved by 
suitably changing variables.”3 If the coordinate functions of the aether frame, 
t, xX, y,Z, are so chosen that the components of v with respect to them are 
Vv, = 0, Vv, = v, = 0,a Galilei coordinate system adapted to the moving frame is 


44 RELATIVITY AND GEQMETRY 


given by the transformation: 


t=t X¥=x-—vt 
_ = (2.2.3) 
y=y zZ=2 
Lorentz introduces the variables 
x' = px (2.2.4a) 
t=t—ax'/c (2.2.4b) 


where # = |(c?/v? —1)7'/?|and B = | (1 —v?/c?)” '/?|. Writing y’ for J, 2’ for Z, 
and substituting x for X in (2.2.4), in agreement with (2.2.3), we have that?+ 


, _ t—(vx/c?) , x—ot 
= 1 (e/e?) aa Jl (w/e?) 
wile (2.2.5) 
=F v=2 


Though the substitution (2.2.4) was devised only as an aid to calculation, it 
involves two ideas for which Lorentz found later a direct physical use. 
Equation (2.2.45) introduces a time coordinate function that depends on the 
spatial location of events, a “local time”, as Lorentz will call it, to distinguish it 
from the universal Newtonian time. Equation (2.2.4a) introduces a space 
coordinate function which increases in the direction of motion f times faster 
than the corresponding Cartesian one, as if it were evaluated by means of a 
standard measuring rod that had been shortened by a factor of B~'. Let us 
consider Lorentz’s physical application of these two ideas in the stated order. 

In his “Inquiry into electrical and optical phenomena in moving bodies” 
(1895) Lorentz develops a theory that he claims to be correct up to first order 
terms in |v|/c (v being the velocity of the moving system in question). Lorentz 
proves that the theory satisfies what he cails the Theorem of Corresponding 
States. Let r and ¢ be respectively the position vector function and the time 
coordinate function associated with the aether frame, and let the position 
vector r’ be given by the standard transformation: 


r=r-v (2.2.6) 


Introduce the “local time” variable 
t=t-svr (2.2.7) 


The Theorem of Corresponding States can then be stated as follows : 


If there is a system of bodies at rest in the aether frame and a solution of 
the field equations such that the dielectric displacement, the electric force 
and the magnetic force in that frame are certain functions of rand ¢, there 
is a solution for the same system of bodies moving in the aether frame 
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with velocity v, such that the dielectric displacement, the electric force and 
the magnetic force in the inertial frame in which the system is at rest are 


exactly the same functions of r’ and t'.2° 


Lorentz concludes that the relative motion of the earth and the aether cannot 
exert on electromagnetic experiments any effects of the first order in v/c 
(v being the relative speed of the earth and the aether).2° Though Lorentz was 
not quite explicit concerning the physical significance of his “local time” and 
twenty years later he once described it as “no more than an auxiliary 
mathematical quantity”, it is clear that the Theorem of Corresponding States 
cannot be experimentally significant unless the local time bears a definite 
relation to the physical time keeping processes observed in the laboratory.?’ 
Lorentz, however, did not try to explain how all such “natural clocks” can be 
affected, and affected in the same way, by carrying them across the aether. He 
was much more outspoken with regard to the contraction of moving metre 
rods, the second idea that we saw was implied by a physical interpretation of 
(2.2.4). To this we now turn. 

On August 18th, 1892, Lorentz wrote to Lord Rayleigh that he was “totally 
at a loss to clear away [the] contradiction” between Fresnel’s value for the 
aether drag coefficient, that agreed so admirably with practically all observed 
phenomena, and the null result of Michelson’s interferometer experiments, 
which was apparently the only exception to the rule.2® Soon thereafter it 
lighted on him that Michelson’s result could be understood at once if “the line 
joining two points of a solid body, if at first parallel to the direction of the 
earth’s motion, does not keep the same length when it is subsequently turned 
through 90°”, so that, if v is the speed of the earth in the aether frame, the first 
length is (1—v?/2c?) times the latter.2? According to Lorentz this 
suggestion—which formerly had been made by G. F. Fitzgerald (1889)—is not 
so implausible as it might seem. For, the size and shape of a solid body being 
determined by the molecular forces, “any cause which would alter the latter 
would also influence the shape and dimensions”. If electric and magnetic forces 
are transmitted by the aether “it is not far-fetched to suppose the same to be 
true of the molecular forces”. This hypothesis is untestable, for we are 
completely ignorant of the nature of the molecular forces; but if the influence 
of motion across the aether on them is equal to that on electric and magnetic 
forces, it will cause in any solid body a deformation of precisely the required 
amount. This conclusion is justified more rigorously in Lorentz’s “Inquiry” of 
1895. In § 23 of that work Lorentz studies the effect of motion across the aether 
on the electrostatic interaction between charged partices. Let S be a system of 
electrons at rest in the aether. If x, y, zare Cartesian coordinates adapted to the 
aether frame, we denote by F,, F, and F, the components, with respect to those 
coordinates, of the electric force field F, generated by S. Let S’ be a system 
equal to S, except for the following two conditions: (a) S’ moves across the 
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aether with velocity v and velocity components, with respect to the said 
coordinates, v, = v, v, = Vv, = 0; (b) in S’ the dimensions of the electrons and 
the distances between them are shortened in the direction of motion by the 
factor B~' = (1 —v?/c?)'/”. Lorentz shows that the components of the electric 
force field F’ generated by S’, with respect to a coordinate system x’, y’, 2’, 
adapted to the rest frame of S’, are related to F,, F, and F, by the simple 
equations 


Fv=F, Fy=f'F, F,= 6° 'F, (2.2.8) 


provided that (i) the primed coordinate system is not Cartesian; (ii) if x, ¥, Zare 
Cartesian coordinates adapted to the rest frame of S’ and so oriented that the 
x-axis glides along the x-axis with speed v, the primed system is related to the 
barred system by the transformation we met already on page 44: 


x’ = BX y=)y Sz (2.2.4a) 


In §§ 89 ff. Lorentz discusses Michelson’s results,°° arguing that they can be 
reconciled with the hypothesis of the stagnant aether if we assume that the 
molecular forces that keep together a solid body, such as the arms of the 
interferometer, are subject to the same transformation law (2.2.8) that is 
applicable to electrostatic forces. For, on this assumption, the equilibrium of 
the molecular forces can only be maintained when the body moves across the 
aether with constant speed », if its dimensions are shortened in the direction of 
motion in the ratio of 1 to (1 —v?/c?)'/*. Lorentz readily admits that “there is 
certainly no reason why” the said assumption should hold, but he apparently 
regarded Michelson’s result as evidence for it.?1 He also expected that the 
effect would not be quite so neat in reality, since the molecules of a solid body 
are not exactly at rest but oscillating relative to each other. However, the 
interferometer experiments were not so exact that such small effects could 
modify their results.2? 

In his “Simplified theory of electrical and optical phenomena in moving 
systems” (1899), Lorentz extends his Theorem of Corresponding States to 
effects of the second order (i.e. effects of the order of v?/c?, where v is the speed 
of motion across the aether).*> With this purpose he introduces still one more 
coordinate system, slightly different from the primed system with respect to 
which the Theorem is valid up to first order terms. So we now have to do with 
four coordinate systems: 


(i) The unprimed Galilei system (t, x, y, z), adapted to the aether frame. 

(ii) A Galilei system that we designate by the barred variables (f, x, ¥, Z), 
adapted to the rest frame of a material system moving across the 
aether with velocity v. This system is so chosen that the X-axis glides 
along the x-axis with speed v = |v |. It is therefore related to (i) by the 
Galilei transformation (2.2.3). 
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(iii) The primed system to which the first order Theorem of Cor- 
responding States is referred in this paper of 1899 (see note 26), and 
which is related to (ii) by the transformation: 


(2.2.9) 


y=y 4 
(iv) A doubly primed system related to (iii) by the transformation: 
t" = t'/Be x"=x'/é y=y'/e Zz" =2z'/e (2.2.10) 


where ¢é is “an indeterminate coefficient differing from unity by a 
quantity of the second order”, and f retains the value we have given it 
throughout the present Section.** 


Lorentz also gives transformation formulae for the components of the 
electromagnetic fields with respect to (iii) and (iv), and goes on to argue that the 
Theorem of Corresponding States will hold up to terms of second order if we 
substitute the doubly primed system for the simply primed one in the previous 
formulation of it, provided that we not only suppose that a system at rest in the 
aether may be changed according to the given transformation into the 
imaginary doubly primed system, moving across the aether with velocity v, 
“but that, as soon as the translation is given to it, the transformation really 
takes place, of itself, ie. by the action of the forces acting between the particles 
of the system, and the aether”.>° 

Spoiled as we are by the elegance and clarity that Einstein brought into 
classical electrodynamics we find Lorentz’s method of successive approxim- 
ations vexingly cautious. And yet it cannot be denied that, though it procured 
little insight into the structure of things, it did yield some astounding 
predictions. Thus, the second order theory of Lorentz (1899) implies that the 
inertial mass of a charged particle must vary with its speed (relative to the 
aether), and indeed in such a way that it will oppose a different resistance to 
accelerations parallel and perpendicular to the direction of motion.°° If matter 
in motion across the stagnant, ubiquitous aether suffers such impressive 
changes, there is good reason to look on the aether frame as a standard of rest. 
It is nevertheless ironic that these effects which single out in theory one inertial 
frame among all others—and thus make an end of Newtonian Relativity— 
should have been postulated to explain the impossibility of ascertaining in 
practice which bodies move and which are at rest in that frame. But it is time 
now that we address ourselves to Einstein’s own work, for the study of which 
we have gathered, I hope, sufficient background information. 


CHAPTER 3 


Einstein's “Electrodynamics of Moving Bodies’ 


3.1 Motivation 


In an undated letter to Conrad Habicht, probably written in the Spring of 
1905, Einstein mentions four forthcoming papers of his.' The first, which was 
about to appear, he describes as “very revolutionary”. It was the article “On a 
heuristic point of view about the creation and conversion of light”, in which 
Einstein, boldly generalizing Planck’s quantization of energy, postulated that 
“the energy of light is distributed discontinuously in space” in the form of 
localized “energy quanta ... which move without being divided and which 
can be absorbed or emitted only as a whole”. The fourth paper, still a draft, 
would contain “an electrodynamics of moving bodies, involving modifications 
of the theory of space and time”.? This article, submitted in June 1905 to 
Annalen der Physik and published in September, laid down the foundations of 
Special Relativity. In 1952 Einstein told R.S. Shankland that the subject of this 
paper “had been his life for over seven years”, that is, from 1898 to 1905, while 
that of the former had busied him for five, that is, since the publication of 
Planck’s trailblazing paper of 1900.* Though it would not be easy to guess 
from either text that they were written by the same man, practically at the same 
time, the insight leading to the paper on radiation inevitably influenced 
Einstein’s peculiar approach to electrodynamics. 

In his Autobiographical Notes Einstein describes the effect of Planck’s 
energy quantization on his mind: “All my attempts to adapt the theoretical 
foundations of physics to [Planck’s approach] failed completely. It was as if 
the ground had been pulled out from under one’s feet, with no firm foundation 
to be seen anywhere, upon which one could have built”.* His main problem in 
those years was to ascertain “what general conclusions can be drawn from 
{ Planck’s] radiation formula regarding the structure of radiation and, more 
generally, regarding the electromagnetic basis of physics”.® Shortly after 1900 
it was already clear to him that classical mechanics could not, except in limiting 
cases, claim exact validity.’ The “heuristic viewpoint” on electromagnetic 
radiation implied that classical electrodynamics was not exactly valid either.® 
Although Einstein had the highest regard for Lorentz’s achievements, he could 
no longer subscribe to the latter’s “deep” theory of electron—aether interaction, 
and it would have been premature to replace it by some new-fangled 
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speculation on the fine structure of matter. On the other hand, the predictive 
success of Maxwell’s equations at the macroscopic level was beyond question. 
One could therefore incorporate them without qualms in a wide-ranging 
phenomenological theory.? The paradigm of a phenomenological theory was 
classical thermodynamics, which attained accurate knowledge of thermal 
phenomena, without any hypotheses as to the ultimate nature of heat, by 
translating into exact mathematical language and elevating to the rank of 
universal physical principles two familiar “facts of life”, viz. that there is no 
inexhaustible source of mechanical work, and that a cooling machine cannot 
run solely on the heat it extracts. The principles of thermodynamics constrain 
the laws of nature to be such “that it is impossible to construct a perpetuum 
mobile (of the first and second kind)”.!° Einstein was persuaded that only the 
discovery of a universal principle of this sort could rescue him from the 
quandary he was in, and lead him to assured results. 

The universal principle finally proposed by Einstein in 1905 prescribes that 
“the laws of physics are invariant under Lorentz transformations (when going 
from one inertial system to another arbitrarily chosen inertial system)”.!! As 
we shall see in Section 3.3, this follows from the joint assertion of the 
thoroughgoing physical equivalence of all inertial frames, proclaimed by the 
Relativity Principle (RP), and the constancy of the speed of light in vacuo 
regardless of the state of motion of its source, embodied in what we shall 
hereafter call the Light Principle (LP). The RP was firmly entrenched in the 
tradition of classical physics and could moreover be said.to rest, like the 
Principles of Thermodynamics, on familiar observations (cf. the quotation 
from Galilei on page 31). But the LP was traditionally accepted only as a 
consequence of the wave theory of light, supported by the vast array of facts 
that corroborated this theory, but not resting on any direct evidence of its own. 
Einstein’s decision to retain the LP, upgrading it from a corollary to a basic 
principle, at a time when he had lost faith in the exact validity of the wave 
theory, was a daring but hugely successful gamble.!? 

The LP is indeed too vague, unless one specifies the frame in which light 
travels with constant speed. According to the wave theory this must be the 
frame in which the luminiferous medium oscillates about a stationary position. 
But Einstein had no use for a luminiferous medium and simply referred the LP 
to an arbitrarily chosen inertial frame. This would clash with the RP if, as 
hitherto assumed, velocities transform according to the Galilei Rule when 
going from one inertial frame to another.!? However the apparent incon- 
sistency is dispelled by Einstein’s discovery that the RP, as classically 
understood, is too vague also, because it does not specify how time ts 
“diffused” throughout an inertial frame. If this is done in the manner 
prescribed in Einstein (1905d), § 1, the conjunction of RP and LP (in their now 
precise meanings) implies that the Galilei Rule for the Transformation of 
Velocities is not valid. The rule of velocity transformation that actually follows 
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from Einstein’s principles does not lead to inconsistencies. 

Before going further into the matters we have just sketched it will be useful 
to mention some of the reasons that moved Einstein to uphold both RP and 
LP. In the Autobiographical Notes he speaks of a “paradox” that had haunted 
him since he was a lad: A man travelling with speed c after a beam of light 
would perceive the beam as a static field, periodic in space. Einstein deemed the 
situation unnatural.'* The joint validity of RP and LP makes it utterly 
impossible. In the preamble to (1905d), Einstein argues persuasively for the RP 
alone. In the then fashionable version of classical electrodynamics Maxwell’s 
equations were referred, in their standard form, to a particular inertial frame, 
the aether’s rest frame. This leads to startling and altogether unwarranted 
asymmetries in the description of otherwise indistinguishable phenomena. 
Take, for instance, the interaction of a magnet and a conductor. 


If the magnet is in motion and the conductor at rest, there arises in the neighbourhood of the 
magnet an electric field with a definite energy, producing a current at the places where the 
parts of the conductor are situated. But if the magnet is at rest and the conductor in motion, 
no electric field arises in the neighbourhood of the magnet. In the conductor, however, there 
arises an electromotive force, to which no energy corresponds, but which—assuming 
equality of relative motion in the two cases discussed—generates electric currents of the 
same path and intensity as those produced by the electric forces in the former case.'® 


Examples of this kind, “together with the unsuccessful attempts to detect a 
motion of the earth relative to the ‘light medium’ ”,'° suggest that the Principle 
of Relativity applies to electrodynamic phenomena and laws, not—as Hertz 
professed—if due allowance is made for the state of motion of the aether,'’ but 
if the aether is completely ignored—or if it does not exist at all—and inertial 
frames are defined exclusively by reference to ponderable bodies. At this point 
of the preamble Einstein introduces the LP in an informal version similar to 
the one I gave above. He does not argue for it, but adds only that RP and LP 
are all that is needed for developing “a simple and consistent electrodynamics 
of moving bodies, based on Maxwell's equations for bodies at rest”.'® In 1950, 
however, Einstein told R. S. Shankland of a specific reason for upholding the 
LP. Before 1905 he had countenanced a ballistic theory of light emission, 
similar to that developed by Ritz (1908a, b), but had rejected it because “he was 
unable to think of any form of differential equation that could have solutions 


representing waves whose velocity depended on the motion of the source”.'? 


3.2. The Definition of Time in an Inertial Frame 


It is generally agreed that Einstein’s implicit criticism and outright dismissal 
of Newton’s universal time is somehow the key to Special Relativity. Opinions 
differ, however, as to the true import of the conception of physical time and 
time order that Einstein introduced instead. While one very influential school 
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maintains that in Einstein’s view there can be no natural partition of events 
into simultaneity classes, so that a universal time order can only be settled by 
convention, other writers believe that in the world of Special Relativity there 
are infinitely many such natural partitions, one for each inertial frame.1 We 
shall postpone the consideration of these philosophical positions until Section 
7.1. At this point we shall only examine Einstein’s own discussion of time order 
in an inertial frame in his (1905d), § 1, supplementing it with such remarks as 
may seem necessary for a clear and consistent reading of the text. 

At the beginning of §1 Einstein bids us choose “a coordinate system in 
which the equations of Newtonian mechanics hold good”. He proposes that, 
for easy reference, we call it the “stationary” system. He remarks that if a 
material particle is at rest relatively to this system its position can be defined 
with regard to it by means of measuring rods and the methods of Euclidean 
geometry, and can be expressed in Cartesian coordinates. We shall understand 
therefore that the stationary system is a Cartesian coordinate system fora rigid 
frame—hereafter called the “stationary frame”. It follows from the remaining 
sections of Einstein’s paper that neither the Second nor the Third Newtonian 
Laws of Motion do really hold good in the said frame. To avoid a blatant 
inconsistency we must conclude that Einstein expects only the First Law of 
Motion to be true in the stationary frame. The latter is therefore best described 
as an arbitrary inertial frame in Lange’s sense.” We saw on pages 19-20 that in 
the Newtonian theory the Third Law is quite essential for distinguishing the 
family of inertial frames from any family of frames travelling past it with the 
same constant acceleration. But the Third Law is not required for singling out 
the inertial frames in Einstein’s theory. Einstein does not go into the question 
here perhaps because he thought that to state it clearly he needed the very 
concept of time which he still was in the course of defining. But in the light of 
his later writings there can be no doubt that the following is assumed by him: If 
an inertial and a non-inertial frame move past each other with uniform 
acceleration, a light-ray emitted through empty space in a direction normal to 
the mutual acceleration of the frames, describes a straight line in the inertial 
frame, a curved line in the other.? The rectilinear propagation of light in vacuo 
provides therefore an additional criterion for the identification of inertial 
frames, which, be it noted, does not presuppose a definition of time. Summing 
up, we may say that Einstein’s stationary system is a Cartesian coordinate 
system defined on the relative space S; of a rigid frame F, which meets the 
following conditions: 


(I) Three free particles projected non-collinearly from a point in F 
describe straight lines in Sy. 

(II) A light ray transmitted through empty space in any direction from a 
point in F describes a straight line in Sr. 


After making the remark reported above concerning the assignment of 
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coordinates to a particle at rest in the stationary system, Einstein goes on to 
say: 


If we wish to describe the motion of a material point we give the values of its coordinates as 
functions of the time. Now we must bear in mind that such a mathematical description will 
only have a physical meaning if we are quite clear beforehand as to what we understand by 
“time”.4 


He then introduces the distinction that we mentioned in Section 1.3 between 
time at a point and time throughout space. We saw there that Newton’s “true 
and mathematical time” was relevant to the scientific description of pheno- 
mena only insofar as it was embodied in natural processes which, according to 
the Laws of Motion, do, in ideal circumstances, keep true time. But we also 
observed that such Newtonian clocks keep time only at the point of space 
where they are actually located—the point, that is, where the event designated 
for the role of time counter is taking place. Though Einstein cannot resort any 
longer to the Newtonian Laws of Motion for the definition of foolproof time- 
keeping devices, he unhesitatingly relies on the concept of a natural clock, that 
keeps true time at a point of space. Indeed the possibility of such clocks follows 
at once from the Law, postulated “in agreement with experience” near the end 
of § 1, that the round trip speed of light through empty space in the stationary 
frame is a universal constant. As Pierre Langevin observed, a light-pulse 
travelling to and fro between two parallel mirrors facing each other at the 
endpoints of a rigid rod at rest in that frame constitutes such a natural clock. 
The “Langevin clock” obviously keeps time only at either end of the rod, and 
one cannot even assert, without further stipulations, that it keeps the same time 
at both ends. But the time shown by a clock at a point in the stationary frame 
cannot be used for describing the successive positions of a particle moving in 
that frame. As the particle goes from one point to another, the time must be 
read in different clocks. How are they to be set in agreement with one another? 
As early as 1884 James Thomson had realized that time is not diffused of itself 
throughout all space, but must be propagated from point to point by some 
method of signalling. He assumed, however, as a matter of course, that every 
plausible signalling method—such as transmitting light pulses at appointed 
times, or sending out messengers furnished with accurate, stable clocks, set by 
the clock at the origin—would ideally lead to the same universal system of 
chronometry.> Thomson knew that this assumption was not necessarily true. 
Einstein ventured to think that it was contingently false. He was apparently 
willing to allow any method for propagating time over space, provided that it 
met the basic requirement of any physical method, namely, that it be freely 
reproducible and yield consistent results. But he was intent on defining 
without ambiguity whatever method might be finally adopted, on stipulating 
exactly the idealized physical conditions in which two events should be said to 
happen at the same time. By thus substituting precision for Thomson’s 
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looseness, Einstein avoided or diminished the risk of using mutually 
inconsistent methods, that may seem equivalent in the eyes of common sense. 

To emphasize the latitude of choice that we enjoy in this matter Einstein 
proposes, with tongue in cheek, the following method for defining time on the 
stationary frame. Let A be a natural clock at rest in the frame. Let P be any 
point of the frame. An event E at P will be assigned the time coordinate t if A 
shows time tg upon reception of a light signal sent through empty space from P 
to A when E occurred. Einstein remarks that this method has the disadvantage 
that it is not independent of the location of A. It is also liable to a stronger 
objection. If time is defined on the stationary frame by this method a free 
particle moving in that frame will not travel with constant velocity, but will 
suffer a sudden loss of speed when its distance from A attains its minimum 
value. Indeed the requirement that the First Law of Motion be valid on the 
stationary system implies that the time coordinate function cannot be chosen 
altogether freely, for it must agree with Lange’s definition of an inertial time 
scale: 


(III) A free particle moving in the stationary frame F traverses equal 
distances in equal times. 


But this requirement does not determine the time coordinate function 
uniquely. Apart from the trivial possibility of changing the origin of time and 
the time unit, it is clear that, if x, y and z are the space coordinate functions of 
the stationary system and ¢ is a time coordinate function compatible with (III), 
the function 


t=t+taxtby+ez (3.2.1) 


(where a, b and c are any three real numbers) is also compatible with (IID). 

The ambiguity that is still inherent in Lange’s stipulations is finally 
overcome by Einstein’s second and, as he says, “more practical” proposal for 
defining the time coordinate function for the stationary system. Let A and B be 
two clocks at rest anywhere in the stationary frame. Let L be a light signal 
transmitted across empty space from A to B, reflected without delay at B, and 
retransmitted in vacuo again from B to A. (I shall eventually refer to signals of 
this sort as bouncing light signals or radar signals.) Let t, and tp be, 
respectively, the times of emission and reception of L at A. The clock B can 
now be set in synchrony with the clock A if we “establish by definition that the 
‘time’ required by light to travel from A to B equals the ‘time’ it requires to 
travel from B to A”.® B will then be synchronous with A if and only if the signal 
Lis reflected at B when this clock shows the time (tp —t,)/2. An event E at the 
location of B will be assigned the time coordinate ¢ if B shows time t when E 
occurs. 

Einstein’s definition of time on the stationary frame F is therefore 
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equivalent to the following stipulation: 


(IV) A light pulse transmitted through empty space in any direction from 
a source at rest in F traverses equal distances in equal times. 


Though Einstein here presents (IV) as a matter of definition, he must implicitly 
assume that it is compatible with (III), or the stationary frame would not be 
inertial as was prescribed at the outset. He assumes also that all time 
coordinate functions defined by the foregoing method from any point on the 
stationary frame will agree with each other up to a change of origin and time 
unit.’ These are factual hypotheses that could very well clash with experience. 
But we need not linger on them for they follow at once from the much stronger 
factual assumption introduced by Einstein at the beginning of § 2: the Light 
Principle. 

A time scale satisfying conditions (III) and (IV) will be called Einstein time. 
Two events that happen at the same Einstein time are said to be Einstein 
simultaneous. 


3.3. The Principles of Special Relativity 


Let F bea rigid frame satisfying the conditions (I) and (II) of Section 3.2. We 
shall say that F is an inertial frame (in Einstein’s sense). Let X, ¥ and Z be the 
three coordinate functions of a Cartesian system for F. Let the real valued 
function t assign a “time” to every “possible event” in agreement with 
conditions (III) and (IV) of Section 3.2. As we did on page 23 we now define 
the space coordinate functions x, y and z on the set of “possible events” by the 
following simple method: if the event E occurs at the point P of F, x(E) = x(P), 
y(E) = ¥(P),and 2(£) = 2(P). The four functions ¢, x, y, z,in that order, define a 
mapping of “possible events” onto R*. Any mapping that is constructed in this 
way, using a given system of stable but otherwise arbitrary units of length and 
time will be called a Lorentz chart, adapted to the frame F.’ We are now ina 
position to state the RP and the LP with due precision. 


The Principle of Relativity (RP). The laws by which the states of physical 
systems undergo change are not affected, whether these changes of state 
be referred to the one or the other of two Lorentz charts. 

The Light Principle (LP). A light ray moves in the “stationary” frame 
defined in Section 3.2 with the constant speed c, whether it be emitted bya 
stationary or by a moving body. (Here “speed” is the ratio of the distance 
traversed by the light in the relative space of the frame, to the difference 
between its times of reception and emission as determined by the time 
coordinate function of a Lorentz chart adapted to the frame.)? 


The “laws” mentioned in the RP are not the real relations between classes of 
events that a philosopher would call by that name, but rather the relations 


EINSTEIN'S ‘ELECTRODYNAMICS OF MOVING BODIES’ 55 


between coordinate functions, and functions of such functions, by which the 
physicist seeks to express the former. If the RP spoke about real relations of 
events it would be trivial: obviously, such relations cannot be affécted by the 
choice men make of a coordinate system for describing them. But if it concerns 
the mathematical expression of such relations in terms of coordinate systems 
of the kind specified above, the RP is a very powerful criterion for the selection 
of viable physical laws. As the reader probably knows, it implies that Maxwell’s 
equations do, while Newton’s Second and Third Laws of Motion do not, 
qualify as candidates for that status. 

From the “heuristic viewpoint” of Einstein’s (19055) the LP may be 
regarded as the Principle of Inertia applicable to light quanta. (For greater 
clarity, one ought perhaps to replace in the above statement of the LP, 
Einstein’s expression “a light ray moves”, which anyway sounds peculiar, by “a 
light point moves”.) We must not lose sight of the striking analogy between 
Lange’s treatment of the classical Principle of Inertia (IP), that governs—also 
in Special Relativity—the free motion of material particles, and the manner 
how Einstein introduces the LP. Lange saw that the IP is meaningless unless 
we define at least one frame in which it holds good. Lange defines such a frame 
by considering the behaviour of three free particles in motion. (See Definitions 
1 and 2 on page 17, reproduced as conditions (I) and (IT) in Section 3.2.) The 
IP is the factual statement that every other free particle moves in the frame thus 
defined in the same way as those three, that is, in a straight line, with constant 
speed (page 17, Theorems | and 2). Lange’s definition is good enough for the 
task he set himself, but for the purpose of stating a general principle 
concerning the propagation of light it is too vague. Even if we narrow it down 
by adding the condition (II) of Section 3.2 it is still compatible with infinitely 
many alternative statements about the velocity of light—as one may readily 
infer from equation (3.2.1). The ambiguity is removed by adding the condition 
(IV) of 3.2, which Einstein handles as a definition, just as Lange did with 
conditions (I) and (III). This condition (IV) concerns the behaviour of a single 
light-pulse, propagating in every direction from a point at rest in a Lange 
inertial frame. Time is to be defined on the frame in such way that at each 
instant the light-front lies on a sphere with its centre at the source and its radius 
proportional to the time elapsed since the light was emitted. Having thus 
defined the requisite system of reference, Einstein is able to formulate the LP. 
It is the factual statement that every other light-pulse propagates in the said 
frame in the same way as the former, regardless of the state of motion of its 
source; i.e. that at each time—determined in the agreed manner—the light-front 
will lie on a sphere with its centre at the point of the frame where the source was 
placed at the moment of emission, and its radius proportional to the time 
elapsed since that moment. The LP remains hopelessly imprecise if the system 
to which it is referred is not defined beforehand. In this the LP does not differ 
essentially from the IP. It is a sure mark of Lange’s and Einstein’s genius—yet 
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at the same time a source of seemingly endless philosophical befuddlement— 
that they so contrived the requisite definitions that the prototype objects—the 
three free particles, the light-pulse—involved in them were made to follow by 
convention precisely the same pattern of behaviour which the remaining 
objects of their kind turn out to have as a matter of fact (to the approximation 
within which Special Relativity holds good). Such a beautiful consistency of 
description could not have been achieved in this way if the world were other 
than it is. But the factual support that makes for the truth of a physical theory 
need not blind us to the freedom that must be exercised in order to 
state such a theory intelligibly. 


3.4 The Lorentz Transformation. Einstein’s Derivation of 1905 


Let x and y be Lorentz charts. The mapping y:x~?, defined on the range of 
x in R*, and with values in R‘, is called a Lorentz coordinate transformation. 
The RP implies that the laws of physics, referred to an arbitrary Lorentz chart, 
must preserve their form when subjected to a Lorentz transformation. To 
make Special Relativity into a workable physical theory we therefore need to 
know the general equations that govern such transformations. In §2 of his 
(1905d) Einstein inferred those equations from the RP and the LP. There are 
several other ways of deriving the Lorentz transformation equations, some of 
which are perhaps simpler, or more elegant, than Einstein’s, or proceed froma 
different set of assumptions. We shall comment on some of them in Sections 
3.6 and 4.6. But we shall first give here Einstein’s original derivation in 
full.? 

We assume that every Lorentz chart agrees with measurements performed 
with natural clocks and unstressed rods at rest in the frame to which the chart 
is adapted. We also assume hereafter that the stable system of length and 
time units employed in Lorentz charts is so chosen that speeds are pure 
numbers.” (This can be achieved by choosing, say, the metre as the unit of 
length and a convenient multiple of it as the unit of time—a metre of time 
being the time required by a light signal for traversing a metre of length in an 
inertial frame in empty space— or by choosing the second as the unit of time 
and a suitable fraction of the light-second—which is equal to 2.998 
x 10° metres—as the unit of length, etc.) c designates the speed of light in 
vacuo in the chosen units, relative to an inertial frame in which a global time 
has been defined by Einstein’s method. If x is a Lorentz chart, we denote its 
time coordinate function by x°, its space coordinate functions by x!, x? and x’. 
Latin and Greek indices are used as indicated on page 23. As on page 54 we 
distinguish carefully between the Lorentz space coordinate functions x’, 
defined on events, and the corresponding Cartesian coordinate functions x’, 
defined on the relative space S; of the frame F to which the Lorentz chart x is 
adapted. The value of x* at a given point of S;agrees with the value of x* at any 
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event taking place at that point, but the arguments of X* and x? are never the 
same. 
If x and y are Lorentz charts adapted to the same frame, it is clear that 


xo= ty +k 


3.4.1 
x*= Lpaapy? + k, ( ) 


for a suitable set of real numbers k;, a,,, such that the matrix (a,,) is 
orthogonal. A Lorentz transformation governed by equations (3.4.1) will be 
said to be non-kinematic (nk). We may obviously, without loss of generality, 
concentrate our attention on the special case in which x and y are adapted, 
respectively, to two different inertial frames F and G, such that G moves in F 
with velocity v £ 0, and both meet some initial conditions that will simplify 
our calculations. Obviously, |v| # c, for otherwise a light-signal emitted in F in 
the direction of vy would remain forever at the same point of G, in violation of 
the LP and the RP. (According to the latter we can always regard G as the 
“stationary” frame mentioned in the former.) The initial conditions that we 
shall impose on x and y are the following: 


(i) Both x and y assign the quadruple (0,0,0,0) to the same event; 
(ii) if o is a permutation of {1,2,3} and E is an event such that 
x°(E) = x7) (E) = x9) (E) = 0, then y?") (E) = y*) (E) = 0; 

(iii) if E is an event such that y'(E) = y?(E)= y?(E)=0, then 
x?(E) = x3(E) = 0, and x!(E) = vx°(E), where v stands for |v); 

(iv) if E, and E, are two events such that x°(E,) <x°(E,) and 
x"(E,) = x*(E,) = 0 (a = 1, 2, 3), then y°(E,) < y°(E,).? 


Two Lorentz charts x and y that satisfy the foregoing conditions will be said 
to match each other. If z and w are two arbitrary Lorentz charts, the 
transformation w:z~' can always be represented by the composite mapping 
(w-y7!)-(y>x7!)-(x-27'), where x and y are matching Lorentz charts 
adapted to the same frames as z and w, respectively, and w: y~' and x-z~! are 
nk transformations. 

Einstein declares that the sought for transformation equations must be 
linear “on account of the properties of homogeneity that we attribute to space 
and time”.* We shall discuss this statement in Section 3.6. For the time being, 
let us regard the linearity of the transformation between the matching charts x 
and y as a lemma, to be proved later. 

Einstein introduces an auxiliary coordinate system which we shall denote by 
u. It is defined as follows: 

u? = x°® ul = x! —px° 


u2 = x? w=x? 


(3.4.2) 


It is not hard to see that all events that take place at a fixed point of G agree in 
their u*-coordinates. Therefore one may say that u is a chart adapted to G. As 
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we shall see it is not a Lorentz chart. Although (3.4.2) is mathematically 
equivalent to the purely kinematic Galilei transformation (1.6.4), it would also 
be wrong to describe u as a Galilei coordinate system for G, not only because 
we have no grounds for identifying x°(= u°) with the universal Newtonian 
time, but, more importantly, because we do not know as yet whether the 
coordinate functions u* agree with distances on G, as Galilei space coordinate 
functions should. (As a matter of fact they do not, if the principles of Special 
Relativity hold good.) 

We shall now determine y° as a function of u. We consider a radar signal 
transmitted in vacuo from a point at rest in G, in a direction perpendicular to 
the (y?, y*)-plane through that point. Let A denote the emission, B the 
reflection and C the final reception of the signal at its point of emission. To 
simplify calculations we assume that A is the common origin of x and y (initial 
condition (i)), and that x'(B) > 0.° It follows that u(A) = (0,0,0,0). Since y is 
the time coordinate function of a Lorentz chart, it is clear that 


y°(A) + y°(C) = 2y°(B) (3.4.3) 


According to the LP, the signal propagates in F with speed c. Conse- 
quently, x'(B) = cx°(B) = cu°(B). Therefore, u'!(B) = (c —v)u°(B), and 
u(B) = (u'(B)/(c — v), u)(B), 0, 0). Since C takes place at the same point of G as 
A, u*(C) = 0. By (3.4.2), x1(C) = vx°(C) = vu°(C). But C is the reception of a 
signal reflected at B and travelling in the direction of decreasing x' during the 
time x°(C) — x°(B). Hence x!(C) = x! (B) —c(x°(C) —x°(B)). Using (3.4.2) 
and the foregoing results we infer that u°(C) = u'(B)(1/(c —v)+ 1/(c +0)). 
Writing @ for u' (B) and substituting into (3.4.3) we obtain: 


19-4 10,0,0,0)4 39 4-t (( = +, \0,0.0) 
c—v C+D 


= 29° s .8,0,0) (3.4.4) 
c—v 


(3.4.4) equates two differently constructed real-valued functions of the real 
parameter i7.° Since y-x~' and x-u7' are linear, the functions in question are 
linear also and have constant first derivatives. Differentiating both sides of 
(3.4.4) with respect to @ and writing Gy°/du' for 

éy® “un 1 


we have that: 


1/1 1 Nye BO aie 
(. 4 \e ae a (3.4.5) 


-—p ete séeu® dul e—v Cu? 


EINSTEIN'S ‘ELECTRODYNAMICS OF MOVING BODIES’ 59 
Hence 
éy® ane éy°® 
du! °c? —v? du? — 


(3.4.6a) 


Since this result does not depend on the choice of A nor on the sign of c, it will 
not be affected if we lift the restrictions introduced for simplicity’s sake before 
reference number 5. By considering radar signals emitted under similar 
conditions in directions perpendicular to the (¥!, ¥?)-plane and to the (y', ¥°)- 


plane, we infer likewise that 
ay® ay® 
aa F au =0 (3.4.65) 


Since y°-u~! is linear, the system (3.4.6) implies that 


eo. o buh V(x — vx" /e? 
y= a{u 9 “) = o( 1 w/e?) ) (3.4.7) 
where « is an unknown function of v. 

Einstein observes that, according to the RP, the LP applies not only to the 
frame F, as we assumed above, but also to the frame G.” Hence, if o is a 
permutation of {1, 2, 3} and B,,,, is the reflection of a radar signal emitted at 
time 0 from the origin of G, at right angles with the (y*, y°)-plane, in the 
direction of increasing py“, 


yO Bory) = cy?(Baa)) (3.4.8) 


Einstein’s next step is to evaluate y*(B,) in terms ot x(B,) at three events 
B,, B,, and B,, that satisfy the foregoing description for a suitable permu- 
tation co. Take B, first. Substituting from (3.4.7) into (3.4.8), recalling that 
u'(B,) = (c —v)u°(B,), and using (3.4.2), we can see that the following 
relations hold at B,: 


1 pen 
yl ac{ -=— :) i ose te ap? (x! —vx®) (3.4.9) 


where B, as on page 44, stands for | (1 —v?/c?)~'/*|. Turning now to B,, we 
note that u} must take at it the same value as at the emission of the signal 
of which B, is the reflection, so that u'(B,)=0. Hence, by (3.4.2), 
x'(B,) = vx°(B,). Looking at the triangle in R*® whose vertices are the x- 
images of the point of emission and the point of reception of the signal, and the 
perpendicular projection of the latter image on the x'-axis, we verify that 
x?(B,) = x°(B,) ,/(c? — v”). Consequently, (3.4.7) and (3.4.8) imply that, at B,, 


y? = oBx? (3.4.10) 
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By a similar argument one shows that, at B;, 
y? = apx? (3.4.11) 


Einstein introduces the designation ¢(v) for «f, which, like «, is a still unknown 
function of v, and, seemingly oblivious of the fact that equations 
(3.4.9}-(3.4.11) have been shown to hold only at certain special events, he 
collects them, together with (3.4.7), in the following set of transformation 
equations of purportedly general validity: 


y? = g(v)B(x° —vx"/c7) sy! = ev) B(x! — vx®) 


y? = 9(v)x? eo Sie (3.4.12) 


That these equations do actually hold everywhere follows at once from the 
fact that y-x~} isa linear transformation of R*, which is therefore fully deter- 
mined by its values at four linearly independent points. Now we know that 
x!(B,) = vx°(B,), that x!(B,) = vx°(B,), and that the remaining, hitherto not 
explicitly given x and y space coordinates of the three events B, are all zero. 
From (3.4.7) and the initial condition (iii) on page 57 we infer that if E is an 
event such that x(E) = (t, vt,0,0), then y(E) = (at,0,0,0). If t #4 0 and, as we 
assumed at the outset, |v| # c, it is clear that x(E), x(B,), x(B,) and x(B;) are 
four linearly independent points of R* at which we know the values of y- x7 !. 
Starting from these known values it is not hard to show that (3.4.12) gives the 
value of y-x~’ at an arbitrary point of R*.8 

Einstein proves next that the validity of the LP in G—which I shall denote 
by LP(G)—can be inferred from its validity in F—that is, from LP(F }—and 
equations (3.4.12). Some writers object that his procedure is circular, for LP(G) 
had to be assumed in order to prove (3.4.12). But this assumption was made on 
the strength of the RP and LP(F). Hence, the proof that LP(F) and (3.4.12) 
imply LP(G), instead of clashing with it, isa welcome test of the consistency of 
Einstein’s principles. Let E and R be the emission and reception of a light 
signal transmitted through empty space, and write x, for x(E), etc. Then, the 
following is a statement of LP(F): 


D,(x%— xX)? = (xR —x$)?c? (3.4.13) 
By means of (3.4.12) one infers that 

LVR—-VE) = VR-V EC? (3.4.14) 
which is a statement of LP(G). 

In the last stage of the argument, Einstein succeeds in determining the 
unknown function ¢(v). He introduces a new Lorentz chart z, adapted to a 
frame H (our notation). z fulfils with respect to y the same initial conditions 
that y fulfils with respect to x, except that, in condition (iii), —v is to be 


substituted for v. It folows that z-y~! is governed by equations of the same 
form as (3.4.12), with z, y and —v, instead of y, x and v, respectively. 
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Consequently, 
2° = p(—v)B(y® + vy'/c?) = plv)y(—v)x° 
1 ‘ i _ 1 
on g(— BO +vy°) = p(v)p(—v)x (3.4.15) 
z* = y(—v)y = p(v)y(—v)x? 
2? = g(—-v)y* = p(v)p(—v)x? 


Einstein observes that, since the relation between the space coordinates z* and 
x* do not depend on x°, H must be at rest in F. Hence, z:x~! is the identity on 
R*,° and 

p(v)p(—v) = 1 (3.4.16) 


To solve this equation Einstein bids us consider a rod at rest in G, 
whose endpoints P and Q have, oo the Cartesian coordinates 
y'(P) = y?(P) = y?(P)=0 and y*(Q) = y?(Q) =0, ¥7(Q) =A. The rod 
moves in F with speed v in a direction sohendicace to its axis. At time 
x° =f its position in F is given by x!(P,) = vt, x?(P,) = ¥°(P,) =0, and 
x1(Q,) = vt, X7(Q,) = A/@(v), X°(Q,) = 0. Consequently, the distance between 
two simultaneous positions of the rod’s endpoints in F is 4/p(v). Einstein 
equates this distance with “the length of the rod, measured in [ F }”, and adds 
that “from reasons of symmetry it is evident that the length of a given rod 
moving perpendicular to its axis, measured in the stationary system, must 
depend only on the velocity and not on the direction and the sense of the 
motion”.’° Consequently 4/(v) = 4/y(—v) and 


p(v) = g(-v) = 1 (3.4.17) 


Substituting for g and f in (3.4.12) we obtain the Lorentz transformation 
equations for matching charts in their familiar form: 


poke ee x? — jie x! — px? 
a ae ere rr 
V1 =(e7/e?) ¥1—(07/c*) (3.4.18) 
y? = x? yp=x?, 


This completes Einstein’s derivation of 1905. For future reference I shall note 
down here the Jacobian matrix of the transformation y:x7!: 


B —fBvce~? 0 0 
oy’ —Bv B 0 0 ' 
(33 Si ane (3.4.18’) 
0 0 0 1 


A Lorentz transformation governed by equations (3.4.18) is a purely 
kinematic (pk) transformation—or boost—with velocity parameters (v,0, 0). 
Pk transformations with velocity parameters (0, v, 0) and (0, 0, v) can be defined 
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by cyclically permuting 1,2 and 3 in (3.4.18) and in the statement of initial 
condition (iii) on page 57. It can be shown, after some calculation, that the 
equations governing the general pk Lorentz transformation with velocity 
parameters (v,, v2, v3) such that L,v? < c?, are:!! 


= xt 0,(Bx° + — py } (3.4.19) 
(w= 1, 2,3; B = |(1—Z,v7/c?)" 1). 


The typical elements of the Jacobian matrix of this transformation are readily 
seen to be: 


éy® oy? éy# _, dy# vv 
= — ;— ed —_— = -] bedi 3.4.1 ; 
Beco? gh ee ee py Cae ge ee 


Every Lorentz transformation is the product of a pk transformation governed 
by equations (3.4.19) and an nk transformation governed by equations (3.4.1). 
It is not hard to verify that the inverse of a Lorentz transformation and the 
product of two Lorentz transformations are also Lorentz transformations. 
The Lorentz coordinate transformations constitute therefore a group of linear 
transformations of R*, which is a representation of an abstract group called 
the Lorentz group. (It is increasingly fashionable to call it the Poincaré group in 
honour of its actual discoverer; the name Lorentz group is then reserved to the 
homogeneous Lorentz group defined below.) Since Lorentz charts are global, 
for each Lorentz coordinate transformation y- x! between two such charts x 
and y there is a Lorentz point transformation y~ ' - x, which maps the common 
domain of all Lorentz charts (the set of “possible events”) onto itself. The 
Lorentz point transformations are also a realization of the Lorentz group. 
We can obtain a purely algebraic characterization of the Lorentz group by 
means of the following simple argument. If x and y are two arbitrary Lorentz 
charts the transformation )x~' is defined by a system of linear equations: 


yi = ZAyxl +k! (3.4.20) 


The coefficients A,, constitute the Jacobian matrix (dy'/éx’) of the transform- 
ation. We denote this matrix by A. Since yx! has an inverse x-y~', Aisa 
non-singular matrix (its determinant det A # 0). Equations (3.4.20) can be 
written in matrix notation as follows: 


yo Aedh (3.4.21) 


where ke R*. If k = 0, equation (3.4.21) defines the general homogeneous 
Lorentz transformation, between two Lorentz charts that share the same 
origin. As the product of two homogeneous Lorentz transformations is a 
homogeneous Lorentz transformation, it is clear that such transformations 
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form a group, the homogeneous Lorentz group L (4). The full Lorentz group is 
the group generated by (4) and the group of translations (i.e. the additive 
group underlying the vector space R*). Thus we shall reach our goal if we 
succeed in characterizing L(4). 

Since the matrix A of a homogeneous Lorentz transformation is non- 
singular, L(4) is a subgroup of the general linear group GL(4, R), and will be 
represented by a certain family of 4 x 4 real non-singular matrices closed under 
matrix multiplication. We shall now proceed to define this family. Let H;; 
denote the (i, j)-th element (0 < i, j < 3) of the matrix 


0 
= (3.4.22) 
0 


r- COCO 


0 -1 
0 0 
0 0 


A short calculation shows that, if the homogeneous Lorentz coordinate’ 
transformation y-x~' is governed by equations (3.4.18), then 
2, 2,H,,y'y) = c7(y°? —(y'? —(0°? — 0°)? 
= c? (x°)? -.. (x!)? => (x?)? ~_ (x3)? 
= L2H, ;x'x! (3.4.23) 
On the other hand, if y-x~! is governed by equations (3.4.1), with k; = 0, then 
c*(y°)? = c?(x°? and L,L,H, y7y? = EZ pH px?" (a, B = 1, 2, 3), because 
equations (3.4.1) with k° = 0 define an isometry of each Euclidean space 
No = const. onto itself. (The projection function 2, is defined on page 23.) 
Since every homogeneous Lorentz coordinate transformation ts the product 
of a pk transformation governed by equations (3.4.18) and two nk trans- 
formations governed by equations (3.4.1) (page 57), the quadratic form 
2,2 ,H;,x'x’ is obviously invariant under the transformations of the group 


L(4). Let y- x7! be an arbitrary homogeneous Lorentz coordinate transform- 
ation and let A = (A;;) be its Jacobian matrix. We have then that 


£,2,Hjjx'x! = £2, Hy y"y" 
= EEE, LH yp Anx Ag! (3.4.24) 
(i, j,h,k = 0,1, 2, 3) 


Equations (3.4.24) will hold good, as they must, on the entire domain of the 
chart x = (x°, x}, x?, x3) if and only if 


Hy, = 2,2, Ay An Any (3.4.25) 
Or, in other words, if and only if 


H=A'HA (3.4.26) 
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where A? = (A,;) is the transpose of the matrix A = (A;,). We may therefore 
characterize the homogeneous Lorentz group L(4) as the subgroup of 
GL/(4, R) represented by all the non-singular 4 x 4 matrices A that satisfy 
(3.4.26).'? It can be shown that this is a Lie group.'? 
By putting i = j = 0 in (3.4.25) we promptly verify that 
(Ago)? = 14+2,(A,0/¢)* = 1 (3.4.27) 
(a = 1, 2, 3) 

Thus there are two kinds of matrices A € L(4), those with Aj) = 1 and those 
with Ago < 1; and there is no path in L(4) that joins the former to the latter. 
(Transformations of the second kind reverse time order.) Recalling that the 


determinant of the product of several square matrices is equal to the product of 
the determinants of its factors, we infer from (3.4.26) that 


det H = (det H)(det A)(det A”) (3.4.28) 
Hence (det A)(det A”) = 1, and as det A = det A’, 
(det A)? =1 (3.4.29) 


Consequently, if Ae L(4), then either det A = 1 or det A = —1, and again 
there is no path in L(4) that joins these two classes of matrices. 
(Transformations of the second class change a right-handed into a left-handed 
Cartesian system, or vice versa.) Thus the Lie group L(4) splits into four 
topologically disconnected parts, each of which can be seen to be a connected 
component: 


(i) {AEL(4)|detA =1; Ago = 1}: 
(ii) {Ae L(A)Jdet A = 1; Ago <1}; 
(iii) {Ae L(4)|det A = —1; Ago > 1}: 
(iv) {Ae L(4)|det A = —1; Ago S 1}. 


To clarify the meaning of this fourfold decomposition of L(4) let us note 
that the set (i) contains the identity matrix J = (6,,) and constitutes a group in 
its own right. It is in fact a Lie subgroup of L (4) that we shall designate by Yo. 
Like any other subgroup, “, acts on the left on its parent group L(4) by 
(g, hhiogh (ge Lo, he L(4)). The sets (ii), (iii) and (iv) are the orbits, under this 
action of #o, of the time reversal transformation 7, the parity transformation 
P (which reflects each Euclidean 3-space 2) = const. about its origin) and their 
product PT, respectively.'* We see at once that the union of (i) and (ii) and the 
union of (i) and (iii) are two subgroups, known as the proper homogeneous 
Lorentz group and the orthochronous homogeneous Lorentz group, respect- 
ively. Their intersection &, is called the proper orthochronous homogeneous 
Lorentz group. The same adjectives are bestowed on the subgroups of the full 
Lorentz group (or Poincaré group) generated by each of these three subgroups 
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together with the additive group R* (the group of translations). The proper 
orthochronous inhomogeneous Lorentz group, generated by #y and R‘*, will 
hereafter be designated by “. One very important remark that we must make 
before passing to a different subject is that the one-parameter pk transform- 
ations governed by equations (3.4.18) do, but the general pk transformations 
governed by equations (3.4.19) do not constitute a group on their own right. 
This is due to the fact that the product of two pk transformations is again a pk 
transformation only if the two triples of velocity parameters of the former are 
eacha multiple of the other.!* As the reader will recall, the Galilei group differs 
in this respect significantly from the Lorentz group (page 29). The full 
Lorentz group is like the Galilei group a 10-parameter Lie group; each element 
of the identity component ¥ can be labelled by the three velocity parameters 
of the pk transformation, the three parameters of the space rotation and the 
four parameters of the spacetime translation that it involves. 

Einstein’s RP of 1905 is a statement about the laws of nature, or rather, 
about the formulae that express them in terms of coordinate systems of a 
special kind, viz. Lorentz charts. Such formulae represent the properties and 
relations of physical events by properties and relations of the real number 
quadruples assigned to the events by the said charts. The RP says that the 
formulae remain valid when one such chart is substituted for another one, or, 
as we may say, when the formulae are subjected to a Lorentz transformation. 
However, the matter is not quite so simple as it might seem at first sight. One 
cannot just conclude, for example, that if f is a function on R* and h is a 
Lorentz coordinate transformation, then if f= 0 expresses a law of nature, 
jf :h = 0 also expresses a law of nature. It would be fitter to say that if f= 0 
expresses a law, then h, f+ expresses a law, where h, is the transformation 
induced by h in the function space to which f belongs. But, since the meaning 
of h, depends on the nature of /, our statement is imprecise and merely 
suggestive. As we shall see in Chapter 4, one of the great merits of the 
conceptual framework that Minkowski took over from geometry for the 
clarification of Einstein’s theory, is that it provides the means for a systematic 
treatment of this question. 

The RP is sometimes presented in a different way, as a statement about 
physical experiments. Thus, we read that, according to the principle of 
relativity, “no physical experiment can distinguish one inertial frame from 
another”, and that “Einstein’s special principle of relativity can be stated as 
follows: under equal conditions all physical phenomena have the same course of 
development in any system of inertia.” '® The equivalence of both versions of the 
RP follows from the fact that every Lorentz coordinate transformation— 
mapping number quadruples on number quadruples—is matched by a 
Lorentz point transformation—which maps events on events.'’ Hence, the 
same considerations that, on pp. 30f., we found applicable to Newtonian 
relativity, apply mutatis mutandis to Einstein’s RP. If the laws of nature meet its 
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demands, any two physical processes whose determining conditions can be 
described in the same way in terms of two Lorentz charts, must be given 
throughout their development identical descriptions in terms of those 
charts.'® 

There can hardly be any doubt that in the minds of Einstein and his 
contemporaries the Lorentz transformations to which the equations of 
mathematical physics and the physical frames of reference may be subjected 
without prejudice to the validity of the former or to the expected outcome of 
experiments in the latter could have been chosen from the full Lorentz group. 
The fundamental laws of physics, as known to them, were immune to time 
reversal, and they would not have allowed a purely geometric transformation, 
such as parity, to have the slightest physical significance. However, in the light 
of subsequent advances in microphysics, it is advisable to restrict the 
admissible transformations to the proper orthochronous Lorentz group #.'? 
Unless otherwise noted, we shall hereafter proceed on this assumption. We 
shall also assume that the space coordinate functions of our Lorentz charts 
always constitute a right-handed system, and that the time coordinate 
functions never assign a smaller number to a man’s death than to his birth. 


3.5 The Lorentz Transformation. Some Corollaries and Applications 


We cannot dwell here on the many consequences of the transformation 
equations derived in Section 3.4. But we shall examine a few of them, that have 
attracted attention in philosophical debates on Relativity, or that will be useful 
to us later on. 

Any mapping partitions its domain into equivalence classes, namely, the 
fibres of the mapping over each of its values. The fibres of a time coordinate 
function over each real number in its range are called simultaneity classes. 
Equations (3.4.1) imply that the time coordinate functions of two Lorentz 
charts adapted to the same frame determine the same partition of events into 
simultaneity classes. Let x be a Lorentz chart adapted toa frame F. Two events 
E, and E, such that x°(E,) = x°(E,) are said to be simultaneous in F. Let y be 
another Lorentz chart matching x and adapted to a frame G. A glance at the 
first of equations (3.4.18) will persuade us that if E, and E, are simultaneous in 
F they cannot be simultaneous in G unless x'(E,) = x'(E,). Generally 
speaking, two events simultaneous in an inertial frame are simultaneous in 
another inertial frame that moves relatively to the former if and only if the line 
joining their spatial locations is perpendicular to the direction of motion. 

Any real-valued function induces on its domain a relation of partial order, 
determined by the linear order of R. The partial order induced on events by 
time coordinate functions is called temporal order. Temporal order is preserved 
by nk Lorentz coordinate transformations (governed by equations (3.4.1)). Let 
x be as above. If x°(E,) < x°(E,) we say that event E, precedes event E, and 
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that E, follows E, in F (abbreviated E, <,E,). Temporal order is not 
preserved by Lorentz pk transformations (governed by equations (3.4.19)). 
However, if E, and E, occur at the same point of some inertial frame F and 
E, <,£,,E, precedes E, in every inertial frame. On the other hand, if E, and E, 
are simultaneous in some inertial frame F but do not occur at the same point in 
that frame, there are inertial frames in which E, precedes E, and others in 
which E, precedes E,. (Take, for example, any two frames moving relatively to 
F in opposite senses, in the direction of the line joining the points of occurrence 
of E, and E, in F.) Needless to say, in this case E, and E, do not occur at the 
same point in any inertial frame. 

If simultaneity, as determined by Lorentz charts, were not frame dependent, 
a light signal could not propagate with the same speed in all inertial frames, 
and the LP would be incompatible with the RP. For let F and G be two inertial 
frames in motion relative to one another and consider a light pulse transmitted 
through empty space from a point O; at rest in F. Let O, coincide, at the time 
of emission, with the point O, of G. Then, if the RP and the LP are jointly true, 
the wave front takes up in either frame, at each instant, a sphere centred, 
respectively, in O, and in Og. Since O; and Og do not coincide except at the 
time of emission, the preceding statement would be contradictory if each 
instantaneous location of the wave front in F were also an instantaneous 
location of the wave front in G. But due to the frame dependence of 
simultaneity, the events that make up the wave front at a given time in F are not 
simultaneous in G, and hence do not constitute the wave front at any time in G. 
The sphere in F, centred at Or, that the wave front covers at some instant of 
F-time, is not a sphere in G, centred at Og, but neither is it the shape of the wave 
front at any instant of G-time. The same can be said, of course, with F and G 
interchanged. 

The foregoing considerations throw some light on the phenomenon known 
as relativistic length contraction. Let Q be a solid cube at rest in an inertial 
frame F. Choose three concurrent edges of Q and label them 1, 2, and 3, as one 
would the axes of a right-handed Cartesian system. The length of edge a in Fy 
is, of course, the distance between its endpoints in Fy. We denote it by A9(a). 
We assume that Q is not subject to any deforming influence, so that A) (a) has 
the same constant value for all three values of « Let F, be another inertial 
frame moving in Fy with speed v in a direction parallel to edge number 1. The 
length A,(a) of edge « in F, is aptly defined as the distance between two 
simultaneous positions of its endpoints in F,.' For greater definiteness we 
introduce a Lorentz chart x, adapted to Fy, such that the meeting point of the 
three numbered edges of Q lies at the spatial origin of x and that, for any event 
E,, taking place at the other end of edge a, x*(E,) = A (a). Let y be a Lorentz 
chart matching x and adapted to F,,. Choose four events E,((i = 0, 1, 2, 3), such 
that E, is mapped by both x and y on (0,0,0,0) and that E,(a = 1, 2, 3,) is 
simultaneous with E, in F,, and occurs at the other end of edge «. Obviously, 
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y*(E,) = 4,(a). We have that 0 = y°(E,) = B(x°(E,) —x!(E,)ve~*). Hence, 
x°(E,) = Ap(L)vc~?, and 


A, (1) = y'(E,) = B(x! (E,) — vx°(E,)) 

= BAg(1)(1 —v?c~?) 

= B*A,(1) (3.5.1) 
1,(2) = y?(Ez) = x?(E,) = Ao(2) 
A,(3) = y?(E3) = x3 (E3) = Ao (3). 


Consequently, the space taken up by Q at any given moment in the inertial 
frame F,,, in which it moves with speed pv, is smaller than the space it fills in the 
inertial frame in which it is at rest, being shorter than it in the direction of 
motion by a factor of B~' = |(1 —v?c~7)'/?|. As one would expect, this factor 
does not depend on the sense of the motion. When considering this 
phenomenon, one must bear in mind that Q, by hypothesis, is not subject to 
any action that might deform it. Hence, we are not entitled to say that Q is 
contracted by motion. On the other hand, it is true that the place Q occupies in 
its own rest frame differs in both size and shape from the fleeting figure it cuts 
at each moment ina frame moving past it. But surely there is no more need for 
this figure to be cubic than for Q’s shadows to be all square. 

A deeper insight into the changes introduced by Einstein in the received 
views on space and time is provided by relativistic time dilation. Let Fo, F,, x 
and y be as in the preceding paragraph. A natural clock C at rest on the spatial 
origin of x is set to agree with the time coordinate function x°. Hence, it also 
agrees with the time coordinate function y° as it goes through the spatial origin 
of yin F,. But on every other occasion it will not agree with y®. For let E be any 
event at C. Then, by (3.4.18), 


y°(E) = Bx°(E). (3.5.2) 
Hence, if E, and E, are two events at C 
| y°(E,) — y°(Ep)| > |x°(E,) ~ x°(Ea) (3.5.3) 


If F,, is studded with clocks keeping y°-time along C’s path, C will appear to 
run slow when checked against those clocks as it passes by them. By symmetry, 
any clock at rest in F,, and keeping y°-time must appear to run slow when 
compared with clocks at rest in Fy and synchronized with C. (Apply, instead of 
y:x7}, the inverse transformation x: y~!.) No contradiction is involved here, 
for it is impossible to compare two clocks more than once by this method. 
However, if C is transferred all of a sudden to an inertial frame FQ moving 
relatively to F,, in the opposite sense though with the same speed as Fy, C will 
again pass by the clocks at rest in F,, with which it was compared earlier. Since 
B depends on v’, (3.5.2) holds also for y° and a suitably chosen time coordinate 
function x'°, adapted to Fo. If C is not damaged by the sudden bounce it must 
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suffer when changing frames, but continues to behave in Fo as the good 
natural clock that it is, then, when it returns, say, to the spatial origin of y, it will 
definitely run behind the clock C, that keeps y®°-time at that place, and with 
which C agreed as they first passed each other. (It is worth noting that C also 
runs behind any clocks at rest in Fy and keeping x°-time that it meets on its 
way, although it was in distant synchrony with them until it changed frames.) 
Since C has rested on two different inertial frames while C, remained fixed in 
one, symmetry is broken, and one cannot predict that C runs ahead of C,.? 
Nevertheless, it has been judged paradoxical that Special Relativity, which is 
based on the denial of the absolute significance of inertial motion, should 
associate with it such absolute effects as the retardation of C with respect to 
C,.3 Some authors have sought to explain the different behaviour of the clocks 
by noting that C is subjected in the course of the experiment to an accelerating 
force. They apparently believe that this can be held responsible for a physical 
change in C that mere inertial motion is not supposed to cause. But our 
argument was predicated on the assumption that C suffered no significant 
change when we reversed its motion relative to F,,. During its entire trip away 
from and back to C,, C is supposed to keep natural time faithfully in agreement 
with the time coordinate of a Lorentz chart adapted to the frame in which it 
rests. Moreover, the impulse applied to C when changing frames can be 
reduced as much as we wish without diminishing the observed time lag, if we 
correspondingly increase the distance traversed by it in F,, with uniform speed 
between its two encounters with C,. Indeed, we can obviate the need for 
accelerating C if, instead of throwing it from Fy to Fo, we set by it aclock C’ at 
rest in Fo as it goes past C, and allow C’ to carry C’s time back to C,,. In this 
case, no force is exerted, each clock follows undisturbed its own natural 
course—e.g., if they are Langevin clocks, their respective light signals all obey 
the LP—and yet, upon meeting C,, C’ lags behind it by exactly the same 
amount as C did in the former case. I must confess, even at the risk of seeming 
presumptuous, that the air of paradox which, in the eyes of many, enshrouds 
relativistic time dilation arises in my opinion from a persistent yet undeclared 
belief in a unique universal time. If there is but one time encompassing both C 
and C, wherever they happen to be, we cannot consistently maintain that they 
both keep time truly and yet agree on their first meeting and disagree on the 
second. A cause must be found why the one has counted more seconds while 
the other has counted less. But if there is no ubiquitous “while” that both 
clocks are required to measure, it is not surprising that each one, though 
faithfully keeping time at its own place, should not register the same lapse of 
time as the other, from one encounter to the next.* 

Important consequences flow from Einstein’s Rule for the Transformation of 
Velocities, to which we now turn. Let A be an object moving with constant 
velocity in the frame F,. What is A’s velocity in F)? We use again the Lorentz 
charts x and y that we introduced above, and let Ax' and Ay! denote the 
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increment of the i-th coordinate function of either chart between any two 
events at A. Clearly, the Ay! are given as functions of the Ax' by equations 
(3.4.18). (Prefix A’s to all x’sand y’s.) Write u, for Ax?/Ax°® and w, for Ay*/Ay®, 
the «-th velocity components of A in terms of x and y. It readily follows from 
(3.4.18) that 

wyto W2 W; 


Tew) 2 BAF w/e)" BOA + ow, /e)) 
(3.5.4) 


Setting £,(w,)? = c’, one verifies, after a short calculation, that £,(u,)? = c?. 


Hence, if A is a light pulse transmitted in vacuo, it moves with speed c in both 
F, and F,,, as indeed it must, if the RP and the LP hold good. But since the 
same can be said if A is any object whatsoever moving in either frame with that 
speed, one surmises that it is not so much the nature of light, but rather some 
structural feature of space and time, that is responsible for the peculiar status 
of speed c. Suppose, on the other hand, that F,, is the rest frame of a medium of 
refractive index n moving across the laboratory frame Fo. If A is a light pulse 
sent across that medium, in the direction of its motion, we have that w, = c/n, 
and w, = w; = 0, so that vu, = u, = 0, and 


_ (c/n)t+v _ 1 cy 
uy, ~ a mE no(t-3 4) (3.5.5) 


This result agrees with Fresnel’s formula for the speed of light carried by a 
moving refractive medium (page 42) if v <c. Being a direct consequence of 
the kinematic rule for the transformation of velocities from one inertial frame 
to another, it need not be explained by physical hypotheses, such as Fresnel’s 
aether drag, or the complex mechanism of matter-light interaction proposed 
by Lorentz.° Einstein’s Rule (3.5.4) also yields improved equations for the 
aberration of starlight and the Doppler effect. 

The main task of Relativity physics is to formulate the laws of nature in a 
manner consonant with Einstein’s RP. Classical electrodynamics met this 
demand splendidly, as Einstein showed without difficulty.® But classical 
mechanics did not, and a new, relativistic mechanics was developed in the next 
few years by Einstein himself, Planck, Lewis and Tolman, and others.’ We 
shall have something to say about it in Chapter 4, after our study of 
Minkowski spacetime gives us the means to see more clearly into the subject. 
There is, however, one relativistic law that we ought to mention beforehand, 
due to its importance for the theory of spacetime. In relativistic mechanics, the 
energy of a material particle moving in an inertial frame increases beyond all 
bounds as its speed approaches the speed of light in vacuo. Since all sources of 
power available for accelerating a particle are presumably finite, it follows that 
no particle moving with speed v < c in an inertial frame can ever attain the 
speed c in that frame. If we postulate further that every particle must have a 


EINSTEIN'S ‘ELECTRODYNAMICS OF MOVING BODIES’ 71 


definite real-valued energy in each inertial frame, it is impossible that a particle 
at rest in any such frame should move with speed v = c in another one. If we 
regard inertial frames as being at least potentially the rest frames of material 
bodies, it follows at once that no two inertial frames can move relatively to 
each other with a speed greater than that of light. Such a restriction on the 
relative speeds of inertial frames of reference was inconceivable in classical 
mechanics because, under the Galilei Rule for the Transformation of 
Velocities, every bound imposed on relative speeds is surpassed for a suitable 
choice of frames. But the restriction is in perfect harmony with Einstein’s Rule, 
by which, if F, G and H are three inertial frames such that G moves in F and H 
moves in G with any speed less than c, H moves in F witha speed less than c.® 
Thus, the fact that the general pk Lorentz transformations (3.4.19) are 
undefined if £,v2 > c? (w = 1, 2, 3) does not impair the usefulness of Lorentz 
charts.’ Again, since a rod or clock cannot travel across an inertial frame with 
the speed of light, there is no paradox involved in the prediction, implicit in 
(3.5.1) and (3.5.2), that at speed c a rod’s length would shrink to zero and a 
clock would stop completely, in comparison with rods and clocks at rest in the 
frame. It is worth emphasizing that the speed c, relative to an inertial frame, is 
an unattainable maximum only for other such frames and for material systems 
at rest in them. By the Einstein Rule, an object travelling with speed c in one 
inertial frame, moves with the same speed in all the others. And nothing in 
Special Relativity precludes the existence of particles moving faster than light 
in all inertial frames. Such particles, known in the literature as tachyons (from 
the Greek word tachys, meaning fast), have been the subject of lengthy 
theoretical discussions, but have not hitherto been detected.!° Within the 
framework of Relativity physics, they would have to be assigned an imaginary- 
valued energy. No braking agency could ever slow one of them down to speed c 
in any inertial frame. Moreover, they could not—lest they be used for 
deliberately changing the past—act as information carriers, and would 
therefore interact with the more familiar forms of matter only in a totally 
random fashion (if at all). The speed of light in vacuo thus possesses, according 
to the Special Theory of Relativity, a fundamental physical significance, which 
does not depend on any of the conventions leading to the definition of Lorentz 
charts: it is the maximum signal speed. As a consequence of this, a relativistic 
world cannot contain a true rigid body, i.e. a material system whose parts are 
constrained by internal forces to stay at a perfectly constant distance from one 
another; for, if such a body existed, one could send a signal across it with 
infinite speed just by pulling or pushing at one of its extremes. 


3.6 The Lorentz Transformation. Linearity 


As I announced on page 57, we shall now examine Einstein’s claim that the 
coordinate transformation between two Lorentz charts with a common origin 
must be linear “on account of the properties of homogeneity which we 


72 RELATIVITY AND GEOMETRY 


attribute to space and time”. To avoid misunderstandings let me point out that 
the term “Lorentz coordinate transformation” can be used in a physical and a 
purely mathematical sense. The physical sense was fixed on page 56 by means 
of the physically defined notion ofa Lorentz chart, that is, a coordinate system 
arising, in the manner described on page 54, from a set of Cartesian 
coordinates on the relative space of an inertial frame and a time coordinate 
established on the same frame by Einstein’s method (page 52). If x and y are 
Lorentz charts, the composite mapping x-y~ ', which sends a real number 
quadruple to another, is a Lorentz coordinate transformation in the physical 
sense. The transformation is orthochronous if the time coordinates of both 
charts increase together (ie. if 6x°/dy° > 0). It is proper if the underlying 
Cartesian systems are both right-handed, or both left-handed. It is homoge- 
neous if both charts have a common origin, so that x: y~'(0) = 0. Ina purely 
mathematical sense a Lorentz coordinate transformation is simply a member 
of the (full) Lorentz group of transformations of R*. It is orthochronous, 
proper or homogeneous if it belongs to the subgroups of the Lorentz group 
that bear these names (p. 64). A Lorentz coordinate transformation in the 
mathematical sense is linear by definition. What Einstein is saying in the 
passage quoted is that, due to the homogeneity of physical space and time, 
linearity is also a property of homogeneous Lorentz coordinate transform- 
ations in the physical sense.’ As we saw in Section 3.4 this linearity is an 
indispensable premise of his proof that both senses of the term are coextensive. 
We must now try to understand how it can be said to rest on the homogeneity 
of space and time. 

We say that an arbitrary set S is homogeneous with respect to a group G if G 
acts transitively on S.? All the elements of S are then equivalent with respect to 
the action of G. If S has a definite structure, we call it a homogeneous space if it 
is homogeneous with respect to its group of automorphisms (i.e. the group of 
structure-preserving permutations of S). Affine spaces are therefore homo- 
geneous par excellence, for an affine space is by definition a set transitively and 
effectively acted on by the additive group underlying some linear space, and its 
affine structure consists precisely of the properties and relations invariant 
under the action of this group. The Einstein time and the Euclidean space of an 
inertial frame are of course affine spaces, transitively and effectively acted on 
by the additive groups of R and R?, respectively; and Einstein was right in 
calling them homogeneous.’ But it is not at all clear that the linearity of the 
Lorentz transformations can follow from this fact alone. 

Let me observe, first of all, that the structural properties that a manifold may 
possess—beyond the one by which, praecise sumpta, it is a manifold—do not 
bestow any peculiar mathematical qualities on the coordinate transformations 
between its charts. Thus, even if a given manifold M is an affine space, a 
coordinate transformation between two charts of M might not be linear if one 
of them is a curvilinear coordinate system (i.e. if its parametric lines are not 
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straight in the affine space M). The structure of the manifold can in fact tell us 
something about the coordinate transformation only insofar as the charts 
involved reflect that structure. The following condition, however, is sufficient 
to ensure the linearity of the coordinate transformations x:y~} and y-x7? 
between two charts x and y of an n-manifold M :M is an affine space, acted on 
by R", and x and ) are linear isomorphisms of the vector spaces (M, x~' (0)) 
and (M, y'~ 1(0)), respectively, onto R".4* Do homogeneous Lorentz coordinate 
transformations automatically meet this condition? 

In order to answer this question let us consider two inertial frames F, and F, 
and let S; denote the Euclidean space of F;(i = 1,2). We shall designate by 7; 
the fibres of an Einstein time coordinate function adapted to F;. (In other 
words, T, is the quotient of the set of “possible events” by Einstein simultaneity 
in F;.) Let x; be the Lorentz chart arising from a Cartesian system x; defined on 
5, and an Einstein time coordinate x? defined on 7;. We wish to know whether 
x,°x}7! isa linear permutation of R*. As is well-known (see note 4), the mere 
choice of the charts x, and x? makes S, and 7, into vector spaces with their 
zeroes, respectively, at the origins O§ = x; '(0)and OF = (x°)~'(0) of the said 
charts. X, and x? are then linear isomorphisms of (S;, 08) and (7;, 07) onto R? 
and R, respectively. But this does not by itself guarantee that the domain of the 
Lorentz charts x, and x, is an affine space, nor that x, ‘x; ‘ isa linear mapping. 
For the domain of x; is neither S; nor 7; but their Cartesian product 7; x S,, 
and, in general, S, # S, and 7, # T,. Wecan indeed view (7, x S;, x; '(0)), in 
a natural way, as the direct sum (7;, 07 )@(S;, O8) of the vector spaces (7;, O/ ) 
and (S;, 05), whereby the Lorentz chart x; becomes a linear isomorphism of its 
domain onto R*.° And we do of course identify the domains of x, and x, by 
equating each pair (t,, P,) in the former with the pair (t,, P,) in the latter, 
where P, is the point in S, with which P, coincides at time t,, and ¢, is the time 
in 7, at which P, coincides with P,. (It is by virtue of this identification that all 
Lorentz charts can be said to share the same domain, to affix numerical labels 
to the same physical world.) If the coordinate systems x? and X; are so chosen 
that (07, O$)is equal in this sense to (OJ, O$), the Lorentz charts x, and x, can 
be meaningfully said to have the same origin. But it is not self-evident that the 
sufficient condition of linearity stated above is thereby fulfilled. It is true 
indeed that each Lorentz chart x,(i = 1, 2) is defined on an affine space which it 
maps isomorphically onto R*. And we have successfully identified the set 
T, x S, underlying one of those affine spaces with the set 7, x S, underlying 
the other. But the correspondence by which the identification was effected did 
not depend on the affine structures bestowed on those sets, and we have yet to 
show that it preserves them. In other words, we would still have to prove that 
the bijective mapping of 7, x S, onto T, x S, defined above by means of 
physical coincidences is actually a linear isomorphism of (7,,07)@(S,, 0%) 
onto (7, 07)@(S;, 03). But rather than grapple with the problem of proving 
this, we shall proceed directly to our main goal of establishing the linearity of 
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the Lorentz coordinate transformation x, +x z', from which the isomorphic 
character of the said bijective mapping easily follows. Our goal can be 
promptly attained if it is true that the additive group of R* = R@R° acts 
transitively and effectively on the common domain of the Lorentz charts x, 
and x,, as defined by the above identification; in other words, if not only the 
Einstein times and the Euclidean spaces belonging to each inertial frame, but 
also the common spacetime constructed from them can be said to be 
homogeneous. This, however, does not follow from the homogeneity of the 
spaces S, and S, and the times 7, and T; alone, but also depends on the way 
how the inertial frames F, and F, move in each other. 

In order to understand why the linearity of the Lorentz transformations is 
thus essentially a question of kinematics, and not just one of geometry (in the 
popular sense) and chronometry, let us consider an example drawn for 
simplicity’s sake from the realm of Newtonian theory. Let F, and F, stand for 
two rigid frames, one of which is inertial while the other one rotates in it with 
constant angular velocity. Let S; stand as before for the Euclidean space of the 
frame F; and set T, = T, = T, the Newtonian time common to all frames. We 
identify the sets 7, x S, and 7, x S, by the same physical criterion employed 
above; i.e., we set (t, P,) = (t, P,) whenever P, eS, coincides with P,€S, at 
te T. Wechoose four points O/ € T;, OF € S;, such that (07, O$) = (OF, 08). We 
construct the linear spaces (T7;,07)@(S,, 05). We establish the Cartesian 
coordinate systems y;: S; > R° and the isometric time coordinate functions y?: 
T;, > R, so that y?(07) = 0, y,(O$) = 0; and define from them, in an obvious 
way, the linear isomorphisms y;: (7;, 0',)@(S;, O',) > R*. Though y, and y, 
meet the same conditions explicitly considered in the above discussion of the 
Lorentz charts x, and x,, it is evident that y, - yz | cannot bea linear mapping.® 
This is due of course to the kinematic relation between the frames F, and F;, 
which is quite different from that which obtained in the previous case, when 
the two frames were assumed to be inertial. Although in each case there are 
homogeneous times and spaces associated with the frames, we can combine 
them into a single homogeneous spacetime only if the frames F, and F, are 
both inertial. To see this let F, alternately rotate and move inertially in the 
inertial frame F,, and consider the motion of an arbitrary point Pe F,. If F, is 
rotating the speed of P in F, will depend on its distance from the axis of 
rotation, and the direction of its motion will vary with time. But if F, is inertial, 
the velocity of P in F, will be the same at all times, for every Pe F,. Thus all 
points of an inertial frame F, stand at all times in one and the same relation to 
another inertial frame F,. This is the “homogeneity property” of space and 
time from which the linearity of the Lorentz transformations actually follows. 
For, if all points of F, always move in F, with the same velocity, the 
homogeneous Lorentz coordinate transformation x, - x, '—in contrast with 
the non-linear transformation y,- yz '—must be indifferent to the choice of 
the particular origin (07, 03) = (OJ, O$) shared by the Lorentz charts x, and 
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x,, and must remain invariant if that origin is changed. Let k and k’ stand, 
respectively, for the values of x, and x, at the new origin, and, for brevity’s sake, 
set x,°x,' =f. Clearly, k’ = f(k). Consequently, the invariance requirement 
stated above implies that, for any ve R*, 


Sot+k)=flv)+k =f) +f k) (3.6.1) 


This relation holds for any change of origin and thus for any ke R*. From 
(3.6.1) it readily follows that for any ve R* and any rational number 4, 


Sf Av) = Afr) (3.6.2) 


If we assume that f is continuous at 0, (3.6.2) holds also for any 4€R.’ 
Equations (3.6.1)}—for arbitrary v, ke R*—and (3.6.2}—for arbitrary ve R‘, 
4¢€R—say in effect that fis a linear mapping.® 

The linearity of Lorentz coordinate transformations can also be estab- 
lished on physical grounds, as follows. Let x and y be two arbitrary Lorentz 
charts. We assume that Einstein’s Relativity Principle holds good. We have 
then that: 


(A) According to the Principle of Inertia, the set of events in the history of 
any free particle are mapped by both x and y onto a set of linearly 
dependent real number quadruples. 

(B) According to Einstein’s Light Principle, the images by x and y of the 
events in the history of a light signal satisfy the equations: 


7 (x9 = X°)? —E,(x6 — 7 = C7 (9 — YP — EG 37 = 0 (3.6.3) 


where (xj) and (1) are the coordinates in each chart of a fixed event 
in that history (i = 0,1,2,3; @ = 1, 2,3). 


If we assume moreover that all Lorentz charts are defined on a common 
domain which they map injectively onto R*, it follows that: 


(C) Lorentz coordinate transformations are permutations of R*. 


The linearity of the transformations x-y~' and y-x~! follows from 


(A) and (C), and also from (B) and (C). Let us now disregard (C) and make 
allowance for the possibility that the transformations send some points 
of R* to infinity. Then (A) entails that they must be projectivities of the 


general form 

: L.a;. xi +a; 

i yee Fi t 

{= (3.6.4) 
x jbjx7 +b 


with (a,,) a non-singular matrix (a, 8 = 1,2,3), and the other coefficients 
arbitrary. (B) implies, on the other hand, that the transformations must belong 
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to the group generated by the transformations of the form 
yi = Ljea,,x/ +4; (3.6.5) 


where eg = 1, e; =e, =e; = —1, the a,, satisfy the condition 
© e,4,;4;, = e;4, and the a, are arbitrary; and the so-called Mébius transform- 
ations, of the form 


i_ iy2 
x! — a; ;e;(x’) 


~ 1-22,e,a;x/ + L;Z,e,a7 e, (x*)? 


y 


(3.6.6) 


where the a; are arbitrary and the e; are as in (3.6.5). The intersection of the 
group of projectivities (3.6.4) and the group generated by the transformations 
(3.6.5) and (3.6.6) comprises only the special linear transformations (3.6.5) and 
is in effect none other than the Lorentz group. (The reader ought to satisfy 
himself that equations (3.6.5) under the conditions stated above agree exactly 
with equations (3.4.20) subject to conditions (3.4.25).) Thus (A) and (B) jointly 
entail that the Lorentz coordinate transformations in the physical sense 
coincide with the Lorentz coordinate transformations in the mathematical 
sense.” 


3.7 The Lorentz Transformation. Ignatowsky’s Approach 


The Russian mathematical physicist W. von Ignatowsky (1910, 1911a) 
claimed that the Lorentz transformation equations can be derived from the 
Relativity Principle alone, without assuming anything about the velocity of 
light or any other specific phenomenon. The constant c which occurs in the 
equations is then demonstrably the least upper bound of the physically 
possible relative speeds of ordinary matter (i.e. the kind of matter that may be 
found at rest in an inertial frame). In fact Ignatowsky used several additional 
explicit and implicit hypotheses besides the Relativity Principle. But they are 
hypotheses of a purely geometrical or chronogeometrical nature, and do not 
concern phenomena of a special kind. By relaxing them we can state the 
general problem of possible relativistic kinematics, first envisaged and fruit- 
fully dealt with by H. Bacry and J. M. Lévy-Leblond (1968). Ignatowsky’s 
work has been repeated and refined by numerous authors, some 
of whom independently lighted on the same ideas, unaware that they had long 
been available in well-known journals.! 

By the Relativity Principle Ignatowsky understands the equivalence of 
inertial frames. Each such frame F is associated with a Euclidean space Sf, in 
which it is at rest, and with a universal time 77, i.e. a unique partition of all 
events in S,; into simultaneity classes which form a linearly ordered metric 
continuum. If F and F’ are two inertial frames each point in S,. describes a 
straight line in S;, traversing equal distances in equal intervals of 7;. As we 
know, this requirement restricts the choice of the possible partitions 7; but 
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does not fix it uniquely. But Ignatowsky gives no further indications on this 
matter—no “rules for clock-synchronization”—for he apparently expects that 
for each inertial frame F the time 7; can be determined by comparison with 
other inertial frames, in the light of the Relativity Principle. Given any inertial 
frame F, every physical event can be described at an occurrence in the space S; 
at some moment of the time 7;. It is assumed that the laws of nature are 
invariant under rotations and translations in S; and under translations of Tp. 
The equivalence of inertial frames entails that they must also be invariant under 
appropriate transformations that switch their reference from one inertial 
frame to another. The aim of Ignatowsky’s inquiry is to establish the form of 
these transformations, 

The problem is made mathematically definite by choosing units of time and 
length and mapping S; isometrically onto R* and 7; isometrically onto R, for 
every inertial frame F. A coordinate system (t, x, y, z), where (x,y,z) is a 
Cartesian system on S; and t is an isometric coordinate function on 7; will be 
said to be adapted to F. (We do not call (t, x, y, z)a Lorentz chart because we do 
not know as yet whether the time 7; agrees with Einstein time.) Let F and F’ be 
two inertial frames such that F’ moves in F with velocity vy. Let (t, x, y, z) and 
(t', x’, y', 2’) be coordinate systems adapted to F and F’, respectively. The 
systems can obviously be so chosen that they both map the same event to 0, 
that their time coordinates increase together and that the homonymous space 
coordinate axes point in the same directions at time t = t' = 0. A simple spatial 
rotation of each system is sufficient to ensure that the (x, x’)-axis lies along the 
direction of motion. As on page 57, let us say that the systems (¢, x, y, z) and 
(t', x’, y’, 2’) are matched together when they satisfy these conditions. The 
velocity of F’ in F is then equal to its component along the x-axis; we shall 
designate it henceforth by v. Obviously, the transformation between the 
matching systems (t, x, y, z) and (t’, x’, y’, 2’) must be uniquely determined by 
this velocity v, for, as we observed on page 74, the relation between the two 
frames is fully expressed by it.?, Ignatowsky believed that he had proved from 
the Relativity Principle that the transformation between two matching 
coordinate systems such as we have described is the Lorentz pk transformation 
with velocity parameters (v, 0, 0}—unless the constant of nature that turned up 
in his calculations were equal to 0, in which case the transformation would be 
the Galilei pk transformation with the said velocity parameters. Ignatowsky's 
proof requires, however, two additional premises, one of which he mistakenly 
declared to be a corollary of the Relativity Principle. The other he failed to 
mention, probably because he perceived its denial as physically untenable. 

The assumptions made hitherto imply that the transformation ts linear. As 
we saw in Section 3.6, this is a consequence of the structure of the spaces and 
times of F and F’ and the nature of the motion of F’ in F.> Hence, if the 
velocities of a set of inertial frames are collinear (i.e. if they point either in the 
same or in opposite directions) in a given inertial frame, they are collinear also 
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in every inertial frame. Thus if (¢, x, y,z), (t’, x’, y’, 2’) and (t’, x”, y, 2”) are 
coordinate systems adapted, respectively, to three inertial frames F, F’ and F” , 
and such that the first system is matched with the second and the second is 
matched with the third, the first system is also matched with the third. Since the 
transformation from the unprimed to the doubly primed system is obviously 
the product of the transformation from the unprimed to the simply primed 
system and the transformation from the latter to the doubly primed one, it is 
clear that the transformations between matching coordinate systems adapted 
to inertial frames with collinear velocities constitute a group. We call this 
group the pk transformation group of inertial frames with collinear velocities, 
and denote it—until the end of this Section—by G. Evidently, Ignatowsky’s 
claim can be immediately verified or refuted if one knows the structure of the 
group G. The following remarks will facilitate its study. 

Consider the set of all inertial frames moving in either direction of a given 
straight line in an inertial frame Fy. The set can be indexed by the velocities 
with which each frame moves in Fy. The index set—i.e. the set of velocities with 
which an inertial frame can travel in a particular direction in a given such 
frame—is a set of real numbers which evidently includes 0. We may reasonably 
assume that if it contains any number v # 0, it also contains every real number 
between 0 and v. The index set, which we shall denote by J, is therefore a real] 
interval, possibly identical with R itself; unless it is equal to RU {oc }—i.e. 
unless there is an inertial frame moving with infinite speed in Fo, a possibility 
we may just as well countenance for the time being to show that we are open- 
minded. As we noted earlier the transformation from the first to the second of 
two matching coordinate systems adapted respectively to Fy and to another 
frame F,, in our set is uniquely determined by the velocity véJ. There is 
therefore a bijective mapping h of J onto G, which assigns to each ve! the 
transformation h(v)éG determined by it. The mapping h induces a group 
structure in I. If v, we J, the group product v * wis equal to h~* (h(w)- h(v)). We 
call the group (J, * ) the velocity group of collinearly moving inertial frames. It is 
evidently isomorphic with G and thus a realization of G in the set J of possible 
relative velocities between such frames. By definition v * w is the velocity in 
F, of the frame of our set that moves with velocity w in F,. Thus the 
group product (v,w)h+v*w is none other than the Rule of the Trans- 
formation of Velocities determined by the group G. Ignatowsky’s claim will be 
fully substantiated if we can show that the group product of the velocity group 
agrees with the Einstein Rule 


(3.7.1) 


degenerating to the Galilei Rule 


vew=v+w (3.7.2) 
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if the constant c~? that occurs in the right-hand side of (3.7.1) is equal to 0. 
We observe that the neutral element of the velocity group (J, * } is 0. Hence 

(3.7.1) or (3.7.2) can only hold good if the inverse of each ve! is —v. This 

group-theoretical condition amounts to the following physical 


Principle of Reciprocity. If the inertial frame F’ moves in the inertial frame 
F with velocity v, F moves in F’ with velocity —v.* 


Ignatowsky maintains that the Reciprocity Principle is a consequence of the 
Principle of Relativity. But he is wrong. The Relativity Principle implies that 
the velocity v’ of F in F’ is the same function of the velocity v of F’ in F as vis of 
v’. This is so if and only if each element v of the velocity group (J, *) is the 
inverse of its inverse. But this trivial group-theoretical truth cannot by itself 
teach us anything about the value of v’s inverse.° Berzi and Gorini (1969) 
derived the Reciprocity Principle from the following: 


Principle of Spatial Isotropy. The transformations of the group G 
commute with the permutation of R* that sends (1, x, y, z) to (t, — x, y, z). 


The Isotropy Principle means that if a given feG transforms a coordinate 
system (t, x, y, z) adapted to a frame F into the coordinate system (t’, x’, y’, 2’) 
adapted to a frame F’, the same feG transforms the coordinate system 
obtained from the former by the mapping x ++ —x into the coordinate system 
obtained from the latter by the mapping x't> —x’. At first sight, one may 
think that the Isotropy Principle follows directly from the twofold fact that the 
relative spaces of the frames F and F’ are Euclidean and that, according to the 
definition of matching coordinate systems, the x-axis and the x’-axis coincide 
and are similarly oriented at all times. However, the reader who has learnt the 
lesson of Section 3.6 will promptly perceive that the Isotropy Principle has a 
kinematical import which cannot rest exclusively on the spatial geometries of 
the frames. Though each geometry remains unaffected by spatial reflection, 
this cannot guarantee the invariance of the kinematic linkage between the 
frames. After all, a space in which a direction of motion has been singled out is 
by this very fact kinematically anisotropic. Berzi and Gorini show that the 
Principle of Spatial Isotropy is equivalent to the following undeniably 
plausible but manifestly kinematical hypotheses: 


(i) Ifvel, —vel. 
(ii) If F and F’ are two inertial frames such that F’ moves with velocity v 
in F, then the following quantities depend on the magnitude, but not 
on the direction of v: 
(a) The ratio between the length in F’ ofa rod at rest in that frame and 
the distance between two simultaneous positions of its endpoints 
in F. 
(b) The ratio between the duration in F’ of a process taking place at a 
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point of that frame and the time interval in F between its 
beginning and its end. 

(c) The ratio between the duration in F of a process taking place at a 
point of that frame and the time interval in F’ between its 
beginning and its end.° 


Let v’ denote the inverse of v in (/, * ). From the Isotropy Principle, Berzi and 
Gorini easily infer that the mapping that sends each element of J to its inverse 
is an odd function, i.e. that 

(-v)/ = -0’ (3.7.3) 


By assuming, moreover, that / is entirely contained in R (i.e. that oo ¢/) and 
that the mapping viv’ is continuous, they show that in effect:’ 


v= -—v (3.7.4) 


The Reciprocity Principle (3.7.4) plus Berzi and Gorini’s postulate that 
I c Rare the only additional premises that we need in order to prove (3.7.1). 
(One may surmise that Ignatowsky did not mention the second one because he 
would not countenance infinite relative velocities between inertial frames.) The 
typical transformation of G, from a coordinate system (t, x, y, z) adapted to a 
frame F toa matching coordinate system (t’, x’, y’, z‘)adapted toa frame F’ can 
be written as follows: 


U = Agolt + Agi X + Aor2V + Ao32 


X= Ayot +a, ,X + a,.y+a,32 
10 11 12¥ 13 (3.7.5) 
Y = Agol + 4_,X + Az2y + A732 


Z' = Ayot + 3,X+432V+4335Z 


Since the only significant spatial direction is that of the (x, x’)-axis—ie. the 
direction of the motion—the transformation must commute with rotations 
about this axis. This implies that a. = a33,€23 = —432 and ap. = ayq = ao3 
= A349 = G47 = Az, = 4,3 = 43, = 0.8 Since the y’-axis and the z’-axis lie at 
time t = 0 respectively on the y-axis and on the z-axis, a,, = 0. The non-zero 
coefficients depend only on the velocity v with which F’ moves in F. We may 
therefore rewrite (3.7.5) as follows:? 


t' = p(v)t + v(v)x y = Av)y 


x’ = a(v)t + B(v)x 2’ =A(v)z 78a) 


where y, v, a, 8 and / are five real-valued functions on I. If F” is a third frame 
moving collinearly with the other two, whose velocity in F’ is w, the 
transformation from (t’, x’, y’, 2’) to the matching system (t”, x”, y”, 2”) adapted 
to F” is given by: 


t” = p(w)t' + v(w)x’ y’ = Aw)y’ 


(3.7.66) 
x” = a(w)t’ + B(w)x’ 2" = A(w)z’ 
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Let u be the velocity of F” in F. Then y” = A(w)y' = A(w)d (v)y = A(u)y. 
Whence A(u) = A(w)A(v). If w = —v, then, by the Reciprocity Principle, u = 0, 
and we have that A(—v)A(v) = 1(0) = 1. By the Relativity Principle, y” is the 
same function of (t’, x’, y’, 2’) as y’ is of (t”, x”, y”, 2”). Consequently, 
A(-v) =A(v) = 1 (3.7.7) 
We also note that, since v is the velocity of F’ in F, if x’ = 0, x = vt. Hence 
a(v) = —vB(v). Thus, equations (3.7.6) can be written as follows: 
t= t+ fo= 
H(v)t + v(v)x y y (3.7.8a) 
x’ = B(v)(—vt+x) Zo. eg 
t” _ L(w)e’ + v(w)x’ y” eens y ; se 
x” = B(w)(—wt' +x’) 2" = vAA ( ates ) 


Likewise the transformation from the unprimed to the doubly primed system 
is given by: 
t” = p(utt+v(u i 
H(u)t + v(u)x y"=y (3.7.80) 
x” = B(u)(—vt + x) Zz! =z 


Equations (3.7.8) are consistent only if 


| h(u) i = | H(v) — v(v) | H(w) — v(w) | 
—uB(u) B(u) —vB(v) Br) —wB(w) B(w) 
= | L(v)u(w) — wr(v)B(w) H(v)v(w) + v(v)B(w) | 
— vB(v)u(w) — wB(v)B(w)  — vB (v)v(w) + B(v)B(w) 
(3.7.9) 
If we set v = —w, then u = 0. Substituting these values into the lower left 
corner of the first and the last matrix in (3.7.9), we have that 0 = wB( —w)u(w) 
—wB(—w)B(w), whence 
L(w) = B(w) (3.7.10) 


Substituting from (3.7.10) into the upper left corner of the matrices in (3.7.9) 
and equating it with the lower right corner, we verify that 


B(v)B(w) —wv(v)B(w) = — vB(v)v(w) + B(v) B(w) (3.7.11) 
whence obviously 
v(v) v(w) 
~ eBoy WB) 


where k is a constant. The lower line of the matrix equation (3.7.9) now reduces 
to: 


(3.7.12) 


uB(u) = (v + w)B(v)B(w) (3.7.13a) 
B(u) = (1 + vwk)B(v)B(w) (3.7.13b) 
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Dividing (3.7.13a) by (3.7.13b) we obtain: 


v+w 


= (3.7.14) 
1+vuwk 


u=vU*W 


We see thus that k must have the dimension of the reciprocal value of a velocity 
squared.'° 

If k = 0, we can set k = 1/c?, with c a constant with the dimension of a 
velocity. (3.7.14) becomes then the Einstein Rule: 


ee ea (3.7.1) 


degenerating to the Galilei Rule (3.7.2) if c~? = 0. Thus the group G is the one- 
parameter group of Lorentz pk transformations if k > 0, and the one- 
parameter group of Galilei pk transformations if k = 0. In either case, the set J 
of admissible relative velocities of collinearly moving inertial frames is equal to 
the open interval (—c,c). If k = 0, ] =R. If k > 0, c is finite. It is then the 
unattainable least upper bound of the speeds with which an inertial frame— 
and hence an ordinary material particle—can move relatively to another 
one.’! 

On the other hand, if k < 0, 1 = RU {00}, as the reader can easily verify.'? 
The group (J, «) has then the topology of the projective line. The Rule of the 
Transformation of Velocities (3.7.14) can yield u = oo even when v and w are 
finite. It can also yield a negative u for positive v and w. Of course, k cannot be 
negative if J < R, i.e. if no inertial frame moves with infinite speed with respect 
to another one—as Berzi and Gorini had to assume in order to prove the 
Reciprocity Principle. Moreover, as Lalan showed,'* the hypothesis that k < 0 
is incompatible with the following: 


Chronology Principle. If an event E, takes place before an event E, ata 
given point of an inertial frame, E, takes place before E, in every inertial 
frame. 


Denial of this Principle would entail that a travelling mother might be able to 
see her sedentary son become younger and younger until he is ready to creep 
back into her womb, while she ages steadily at her usual pace. If we think that 
this is too far-fetched to be countenanced even as a remote possibility, we must 
abide by the Chronology Principle and conclude that k > 0. 

I must finally emphasize that if k is positive, the limit speed ¢ = 1/ /k isa 
constant of nature. One is free indeed to establish a coordinate system linked 
to a given inertial frame F, in terms of which the limit speed of inertial motions 
relative to F is different in different directions. But such a coordinate system 
would not reflect the space-time structure implied by the Relativity Principle, 
the Chronology Principle and the Principle of Spatial Isotropy. 
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3.8. The “Relativity Theory of Poincaré and Lorentz” 


In his History of the Theories of Aether and Electricity, Sir Edmund 
Whittaker describes Einstein’s (1905d ) as “a paper which set forth the relativity 
theory of Poincaré and Lorentz with some amplifications, and which attracted 
much attention”! Though Whittaker’s views on the origin of Special 
Relativity have been rejected by the great majority of scholars,” an exami- 
nation of their factual grounds can further illuminate the core of Einstein’s 
theory and bring out the true nature of its novelty. Whittaker’s case rests on 
two sorts of evidence. There are some remarks made by Poincaré as early as 
1895 or 1898, which definitely anticipate Einstein’s conception of the Relativity 
Principle and may have suggested his criticism of simultaneity, but which do 
not, by themselves, constitute a physical theory. Then, there is the new theory 
of “electromagnetic phenomena ina system moving with any velocity less than 
that of light’’, proposed by Lorentz in 1904 and perfected by Poincaré in a 
paper submitted to the Circolo Matematico di Palermo in the summer of 
1905.° This theory was designed to agree with Poincaré’s view of relativity, 
according to which “optical phenomena depend only on the relative motions 
of the material bodies—sources of light and optical instruments—involved”,* 
and it is, at any rate in Poincaré’s streamlined version, experimentally 
indistinguishable from and mathematically equivalent to Einstein’s elec- 
trodynamics of moving bodies. However, it gives expression to a natural 
philosophy very different from Einstein’s, for it squarely rests on the 
assumption of an aether, which acts equably on all material systems that move 
across it, causing bodies to shrink and natural clocks to go slow by the exact 
amount required to mask their movement with respect to it.° 

Poincaré was convinced that experiments can only disclose relations 
between ordinary material bodies.® Already in 1895 he had warned of the 
impossibility of measuring “the relative motion of ponderable matter with 
respect to the aether”.’ In 1900 he told the Paris International Congress of 
Physics that, “in spite of Lorentz”, he did not believe “that more precise 
observations will ever reveal anything but the relative displacements of 
material bodies”.® In September 1904, speaking in St Louis, he listed among 
the fundamental principles of classical physics—together with the two 
Principles of Thermodynamics, Newton’s Third Law of Motion, Mass 
Conservation and the Principle of Least Action— 

the Principle of Relativity, according to which the laws of physical phenomena should be the 
same for an observer at rest or for an observer carried along in uniform movement of 


translation, so that we do not and cannot have any means of determining whether we 
actually undergo a motion of this kind.? 


On time, Poincaré writes in Chapter VI of La Science et ! Hypothése: 


There is no absolute time. [ . . . ] Not only have we no direct intuition of the equality of two 
time intervals, but we do not even have a direct intuition of the simultaneity of two events 
occurring at different places.'° 
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Hence, simultaneity and time order must be constructed according to agreed 
rules. Poincaré (1898) discusses several possibilities. He observes, in particular, 
that astronomers who measure the speed of light 


begin by admitting that light has a constant speed, and, in particular, that its speed is the same 
in every direction. This is a postulate without which no measurement of that speed can be 
attempted. This postulate cannot be verified by experience, though it could be contradicted 
by it if the results of different measurements were not consistent. [ ... ] The postulate 
[ ... ] has been accepted by everybody. What I wish to retain is that it furnishes a new rule 
for the determination of simultaneity.!! 


There is no doubt that Einstein could have drawn inspiration and support for 
his first work on Relativity from the writings of Poincaré. 

We saw in Section 2.2 how Lorentz had proposed different explanations for 
the null results of the first and second order experiments aimed at detecting the 
relative motion of the earth and the aether. At the 1900 Paris Congress 
Poincaré spoke disparagingly of the ease with which a hypothesis could always 
be found, to account for the mutual compensation of the terms of each given 
order, Isn’t it wonderful that a particular circumstance is at work just where it 
is needed to avoid the detection of the earth’s motion by effects of the first 
order, and that another, altogether different, but no less fitting circumstance, 
precludes its detection by effects of the second order? “No [said Poincaré]; one 
must give the same explanation for the former as for the latter; and everything 
leads us to think that this explanation will be equally valid for all terms of 
higher order, and that the mutual suppression of such terms will be rigorous 
and absolute.”!? Lorentz (1904) acknowledges that 


this course of inventing special hypotheses for each new experimental result is somewhat 
artificial. It would be more satisfactory if it were possible to show by means of certain 
fundamental assumptions and without neglecting terms of one order of magnitude or 
anotnes) saat many electromagnetic actions are entirely independent of the motion of the 
system. 


Lorentz proposes to do this now, for systems moving with a velocity less than 
that of light. He considers a material system %, travelling across the aether with 
speed v < c. X is given a Galilei coordinate system (t, x, y, z), with the positive 
x-axis pointing in the direction of motion, and an auxiliary coordinate system 
(t’, x’, y’, 2’), related to the former by the equations 


= Bp" 'lt— Bloc” ?x 


(3.8.1) 
x’ = Blx ye=ly 2’ = lz, 


Here 8 = |(1 —v?/c?)~'/?|, and / is a “function of v whose value is 1 for v = 0, 
and which, for small values of v, differs from unity no more than by a quantity 
of the second order”.!* By a laborious argument, in the course of which several 
hypotheses are introduced, Lorentz is able to show that / must be constant, and 
hence equal to 1.!° A short calculation will show that, if | = 1, the primed 
system is related to the Galilei coordinate system for the aether rest frame that 
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agrees with the unprimed system at time rt = 0, by a pk Lorentz transformation 
with velocity parameters (v,0,0), such as the one given by (3.4.18). If the 
electric and magnetic forces transform according to the equations set forth by 
Lorentz at the outset, it follows that it is “impossible to detect an influence of 
the Earth’s motion on any optical experiment, made witha terrestrial source of 
light, in which the geometrical distribution of light and darkness is observed” 
or “in which the intensities in adjacent parts of the field of view are 
compared”. '© However, due to a misapprehension of the transformation rules 
applicable to the electric charge and current densities, Lorentz’s new theory 
does not comply perfectly with the RP, and experiments can still be conceived, 
involving speeds not much less than c, which, if the theory were correct, would 
reveal the motion of a material system relative to the aether.!” 

Lorentz’s theory of 1904 continues the line of thought pursued by him in the 
nineties, which it rectifies slightly yet significantly, while maintaining the same 
basic outlook and methodology.'® It is hard for me to understand that anyone 
should have ascribed to Lorentz the authorship of Special Relativity. He, at 
any rate, never claimed it. In his Columbia lectures of 1906 he unhesitatingly 
attributes “the principle of relativity” to Einstein,!° to whom he gives credit 
“for making us see in the negative result of experiments like those of 
Michelson, Rayleigh and Brace, not a fortuitous compensation of opposing 
effects, but the manifestation of a general and fundamental principle”.”° Yet he 
feels that something must be said also for his own approach, for he “cannot but 
regard the ether, which can be the seat of an electromagnetic field with its 
energy and vibrations, as endowed with a certain degree of substantiality, 
however different it may be from all ordinary matter”. From this point of view, 
it seems preferable not to assume, right from the beginning, that “it can never 
make any difference whether a body moves through the ether or not”, and to 
choose rods and clocks at rest in the aether as the standards of length and 
time.’ In a note added to the 1916 edition of the same book he praises 
“Einstein’s theory of relativity” for the simplicity of its treatment of 
electromagnetic phenomena in moving systems, a simplicity which he, 
Lorentz, had not been able to achieve. “The chief cause of my failure [he adds] 
was my clinging to the idea that the variable t only can be considered as the true 
time and that my local time t’ must be regarded as no more than an auxiliary 
mathematical quantity.”?? 

In opposition to Lorentz’s cherished opinion, Poincaré (1906) does assume 
from the outset that “‘the Relativity Postulate (Postulat de Relativité)’— 
according to which “the absolute motion of the Earth, or rather its motion 
relative to the aether instead of relative to the other celestial bodies” cannot be 
demonstrated by experiment—holds good “without restriction”. Even if 
experience, which so far has corroborated the postulate, may well run counter 
to it in the future, “it is, in any case, of interest to see what consequences follow 
from it”.?> Poincaré points out that, while the Fitzgerald—Lorentz contraction 


RAG - D 


86 RELATIVITY AND GEOMETRY 


hypothesis accounts for the null result of Michelson’s experiment, “it would 
nevertheless be inadequate if the relativity postulate were valid in its most 
general form”. However, Lorentz (1904) has succeeded in extending and 
modifying that hypothesis “so as to bring it into perfect agreement” with the 
Relativity Postulate.?* Poincaré’s own dynamics of the electron agrees with 
Lorentz’s theory “in all important points”, though it “modifies them and 
completes them in certain details”.?> This preamble suggests that Poincaré in 
any case embraced the main tenets of Lorentz’s natural philosophy; namely, 
that the aether exists and is the seat of the fields of force governed by the 
Maxwell equations, that uniform motion across the aether subjects material 
bodies and systems to strains and stresses that change their shape and delay 
their internal processes, that real lengths and times can therefore be measured 
only by rods and clocks at rest in the aether. This suggestion is borne out by 
several passages in the text,?° which, on the other hand, bears no trace of 
Poincaré’s having ever countenanced a revision of the fundamental concepts 
of classical kinematics. Poincaré’s electrodynamics of moving bodies definitely 
does not rest, like Einstein’s on a modification of the notions of space and time. 
For this reason, it does not attain to Special Relativity’s universal scope. And 
yet, in his (1906), Poincaré introduced several ideas of great importance for a 
correct grasp of the geometrical import of Einstein’s theory. He chose the units 
of length and time so as to make = 1,a practice that throws much light on the 
symmetries of relativistic spacetime.2” He proved that Lorentz transform- 
ations form a Lie group and investigated its Lie algebra.?* He characterized 
the (homogeneous) Lorentz group as the linear group of transformations of 
R* which leave the quadratic form x? + y? +z? —1? invariant.?° He further 
noted that, if one substitutes the complex valued function it for t, so that 
(ity, Xqs Var Zq) are the coordinates of three points P,(a = 1,2,3) “in four- 
dimensional space, we see that the [homogeneous] Lorentz transformation is 
simply a rotation of this space about a fixed origin”.*° He discovered that some 
physically significant scalars and vectors——e.g. electric charge and current 
density—can be combined into Lorentz invariant four-component entities 
(subsequently called “four-vectors”), thus paving the-way for the now familiar 
four-dimensional formulation of Relativity physics.?! 

Perhaps one ought to take Poincaré’s acceptance of the aether with a grain 
of salt. He had written in La Science et [ Hipothese: 


We do not care whether the aether really exists; that is a question for metaphysicians to deal 
with. For us the essential thing is that everything happens as if it existed and that this 
hypothesis is convenient [commode ] for the explanation of phenomena. After all, have we 
any better reason to believe in the existence of material objects? That again is merely a 
convenient hypothesis, though one which will never cease to be so, while a day will doubtless 
come when the aether will be discarded as useless.>? 


His nonchalance on the matter of physical existence may well have moved him 
to speak in 1906 “as if” Lorentz’s aether were real, given that Lorentz’s 
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formulae yielded satisfactory predictions. Anyhow, it is surprising that 
Poincaré, who was so well equipped to discover the chronogeometric key to 
Special Relativity, should have failed to do so. A. I. Miller blames Poincaré’s 
inductivist philosophy, which led him to treat the Relativity Principle as a 
contingent generalization from experience, to be ultimately explained by the 
laws of electrodynamics.*> But even such a staunch anti-inductivist as Einstein 
would—one hopes—have rejected the RP, had experience been notoriously 
hard to reconcile with it. And although the use of the RP as an axiom in 
Einstein (1905d) precludes its derivation from deeper principles, one cannot 
say that in the more mature formulations of Special Relativity it is left 
unexplained, for, as we shall see in Chapter 4, it is a natural and obvious 
consequence of the structure of Minkowski spacetime. I am inclined therefore 
to attribute Poincaré’s failure to another aspect of his philosophy, namely, his 
conventionalism.>* Persuaded as he was that physical geometry and chronom- 
etry are freely chosen for reasons of convenience and are wholly indifferent to 
the true nature of things, he could not expect to gain any new knowledge by 
tampering with Newtonian time or Euclidean space. This particular attitude to 
the theory of space and time can also help explain why Poincaré did not greet 
Einstein’s paper as the splendid achievement that it was, a decisive 
breakthrough in the articulation of insights that Poincaré himself had been the 
first to suggest.25 Though he recommended Einstein for a chair at the Swiss 
Federal Polytechnic as “one of the most original thinkers I have ever met”,*° 
there is no evidence that he ever said a word in praise of Einstein’s work on 
Relativity, and he insistently referred to the RP as “le principe de Relativité de 
Lorentz” years after Lorentz himself had called it, in the 1908 preface to his 
Columbia lectures, “Einstein’s Principle of Relativity”.*>” However, there could 
be another, at least partial explanation for Poincaré’s silence: it may not have 
been easy, even for a man of his stature, to own that he had lost the glory of 
founding 20th-century physics to a young Swiss patent clerk. 


CHAPTER 4 


Minkowski Spacetime 


4.1 The Geometry of the Lorentz Group 


Hermann Minkowski, professor of mathematics at Gottingen, made several 
important contributions to number theory and other branches of pure 
mathematics.’ However, he is chiefly remembered for having articulated the 
new theory of space and time that is the core of Special Relativity.2 Minkowski 
conducted with Hilbert, Wiechert and Herglotz the 1905 Géttingen seminar 
on electron theory, based essentially on the work of Lorentz.> On November 
Sth, 1907, he gave a lecture at the Gottingen Mathematical Society about “The 
Relativity Principle”.* He introduced the subject as “a complete change in our 
ideas of space and time”, prompted by the electromagnetic theory of light.° 
Mathematicians, he noted, are particularly well prepared to understand the 
new ideas, which merely specify concepts they have been long familiar with, 
but “the physicists must now to some extent invent these concepts anew, 
laboriously carving a path for themselves across a jungle of obscurities, while 
very close by the mathematicians’ highway, excellently laid out long ago, 
comfortably leads onwards.”® A month and a half later he submitted to the 
Gottingen academy a long paper on “The fundamental equations of elec- 
tromagnetic processes in moving bodies”, which gives evidence of the clarity 
and beauty that accrue to classical electrodynamics through the new ideas of 
time and space.’ Finally, on September 21st, 1908, he displayed the new 
natural philosophy in all its glory before the 80th Congress of German Natural 
Scientists and Physicians in his lecture, “Space and Time™.® 

Minkowski put his reading of Relativity in a nutshell thus: “The world in 
space and time Is in a certain sense a four-dimensional non-Euclidean 
manifold.”? On page 22 we observed that, by their handling of space and time 
coordinates, the founding fathers of mathematical physics had in effect 
conceived the changing natural world as a 4-manifold. But Special Relativity 
goes still further, for, according to Minkowski, it implicitly endows the cosmic 
4-manifold with a geometric structure, no less rich than, though distinct from, 
that of Euclidean 4-space. This is the structure that Minkowski, in the above 
text, calls non-Euclidean and that I shall now describe. 

Before tackling this task I must introduce three conventions which we shall 
follow from now on, unless otherwise noted: (1) the units of length and time are 
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to be so chosen that c = 1; (ti)in each term of a polynomial expression, 
summation is implied over the entire range of every pair of repeated indices, 
Latin indices ranging from 0 to 3 and Greek indices from 1 to 3 (Einstein 
summation convention); (iii) q,;1s equal to Oif i # j,to lifi=j =0,andto —! 
otherwise. Hence, by (ii) and (iii), 


ny xixd = (x°P = (x1? = (x7? — 08 P (4.1.1) 


We shall continue to regard Lorentz coordinate transformations as 
permutations of R*. Consequently a Lorentz chart is a bijective mapping of its 
domain onto R*, which assigns a real number quadruple to each “point at an 
instant”, i.e. to each location available for instantaneous punctual events. 
“Points at an instant” are called “worldpoints” by Minkowski.!° Let x be a 
Lorentz chart meeting the conditions stated in the last sentence of Section 3.4 
(p. 66), and let A, denote the set of all Lorentz charts related to x by a 
transformation of the proper orthochronous Lorentz group /. Since all such 
transformations are C”-differentiable, A y bestows on the set of worldpoints 
the structure of a 4-manifold (cf. p. 258). Minkowski calls this manifold, “the 
world”, but to avoid misunderstandings I shall call it Minkowski’s world. 
Minkowski’s world has the same differentiable structure as the 4-manifold .@ 
underlying our construction of a Newtonian spacetime in Section 1.6, and will 
therefore be designated by .%. 

If a = (a°, a', a’, a?) ranges over real number quadruples, the real valued 
function f,:a++n,,a'a’ is a quadratic form on the vector space R*.'! If a is 
mapped on b by a transformation of the homogeneous group &), then 
fi (@) =f, (0). f, is therefore #-invariant.'? Indeed, the full homogeneous 
Lorentz group L(4)—1.e. the group generated by “9, plus time reversal and 
parity—is the largest group of permutations of R* that preserves f,.'* Since 
translations preserve coordinate differences, the function f,:R* x R*>R: 
(a, b)-+n,,(a' — b')(a’ — b’) is Y-invariant. We call f, the Minkowski interval. 
The Minkowski interval determines a geometry on R* much in the same way 
that Euclidean distance determines Euclidean geometry. The invariance of 
y,,a' a’ under the transformations of the homogeneous Lorentz group L(4) 
captures the essence of the former, just as the invariance of the Pythagorean 
quadratic form 6,,a%a* under the orthogonal group O (3) captures that of the 
latter. The obvious analogy between both characterizations indicates that 
these are geometries of comparable richness. The fact that d,, a*a’ is always 
non-negative, and zero only if a = 0, while n,,a'a’ can be positive, negative or 
even zero for non-zero a, implies that the geometry determined by the Lorentz 
group is in effect non-Euclidean. 

The action of ¥ on R* asa group of coordinate transformations is not too 
hard to visualize. A proper orthochronous Lorentz transformation is the 
product of a translation t and a homogeneous transformation gé py. The 
effect of t is very simple: t maps each real number quadruple (t, x, y, z) on the 
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quadruple (t + a,x +, y+c,z+d), where (a, b,c, d) is a fixed element of R*, 
characteristic of t. As a result of this, every “point” in R* is “transported” a 
fixed (Euclidean) distance tn a fixed direction. The homogeneous transform- 
ation o is, as we know, equal to a pk transformation @, of the form (3.4.18), 
preceded and followed by rotations of the form (3.4.1), with k; =0 and 
det[a,,] = 1. Let p denote any of these rotations. p maps the linear sub- 
space {(t, 0,0, O)le ER} identically onto itself. On each hyperplane 
((k, x,y, 2)|X, 3, 2 Rik = const. |, pacts like a rotation through a fixed angle 
0, characteristic of p, about an axis containing the points (k, 0, 0,0) and 
(A,r,,12,13), Where r= (r,,r2,r3) is an eigenvector of the matrix (yg) 
(in (3.4.1)).4 It only remains to visualize the effect of the pk transformation @, 
with velocity parameters (v, 0, 0). In view of (3.4.18), it is clear that on the plane 
{(0,0, y, z)|y,z ER} @ is the identity, whereas its restriction to each plane 
((t, x, ky, k2)|t, x € Rik,, ky const. isa permutation of this plane. g also maps 
each hyperboloid 1? — x? — y? —z* = const. onto itself. Let us consider the 
effect of g on the plane { (1, x, 0, 0)/t, x € R}. This plane intersects each of the 
aforesaid hyperboloids on the hyperbola 1? — x? = const., which @ maps onto 
itself. p rotates the axes T = {(t,0, 0; 0)|t¢€R} and X = {(0, x,0,0)|xeR} 
about the origin (0, 0, 0,0) towards one of the two asymptotes 17 —x? = 0, 
through the angles —@ and @, respectively, where @ = arc tan v. Since 
v<1, -—7/4<0<127/4. Since o is linear, the entire hyperplane 
H = {(0,x,y,z)|x,y,zER} is rotated by g about the fixed plane 
{(0, 0, y, z)|y,zER}. Let <, > denote the inner product on R* obtained 
by polarizing the quadratic form f, (see (4.2.1)). If t = (t,0,0,0) and 
h = (0,x, y,z) are arbitrary points of T and H, evidently (1,h> = 0. The &o- 
invariance of f; implies that <@(t),p(h)> = 0, as well. We observe finally that, 
as a linear permutation of R*, @ preserves parallelism (in the Euclidean sense). 

Let x € Ay. We define the interval F (P, Q) between two points P and Q of 
Minkowski’s world .4 by 


F(P, Q) = fa(x(P), x(Q)) 
= ni (x (P) — x'(Q)x(P) — x(Q)) 


Clearly, F does not depend on the choice of x and is invariant under the action 
of Y on.@ asa group of Lorentz point transformations.'> By this action, .@ is 
endowed witha geometry that is fully characterized by the #-invariance of the 
interval function F and is structurally identical with the geometry defined on 
R* by the action of ¥ as a group of coordinate transformations. Minkowski 
spacetime is Minkowski’s world, with this geometry. The comparative ease 
with which it can be defined should not blind us to its physical significance, 
which is solidly grounded on the physical hypotheses that go into the 
definition of Lorentz charts. If, as it turns out, those hypotheses are true only 
locally and approximately, the physical significance of Minkowski spacetime is 
correspondingly limited, but by no means suppressed. 


(4.1.2) 
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4.2 Minkowski Spacetime as an Affine Metric Space and 
as a Riemannian Manifold 


The action of the Lorentz group ¥ on R* illuminates the structure of 
Minkowski spacetime just as the action of the Galilei group on R* disclosed 
us that of Newtonian spacetime (p. 27). Here, however, we shall approach 
Minkowski spacetime from a different, or rather from two different points of 
view, that will be useful in the sequel. In the first place, we shall consider it as an 
affine metric space. This approach is both elegant and natural, and gives an 
intuitive grasp of its geometry. In the second place, we shall conceive it as a 
Riemannian manifold. This view of it may seem unnecessarily complex, but it 
underlies Minkowski’s formulation of relativistic mechanics and electro- 
dynamics, and the subsequent development of Genera! Relativity. 

An affine space consists of a point set, a vector space, and a transitive and 
effective action of the additive group of the latter on the former. Let H be a 
mapping of R* x .@ into .M (where .@ designates just the set of worldpoints, 
and we forget its differentiable structure for the time being). Assume that (i) for 
any a, be R‘andany Pe.@, H(at+b, P) = H(a,H(b, P)); (ii)forany P,Q e.a 
there is some a€ R* such that H (a, P) = Q; (iii) H (a, P) = 0 forevery Pe.“ if 
and only if a=0. <.@,R*,H) is then an affine space. We shall now 
characterize some subsets of .@ which we call flats. P and Q denote points of 
Mu, v and w are linearly independent vectors in R*; instead of H (u, P), we 
write uP. The 1-flat or straight through P generated by u (under the action H) 
is the set of all points Q in 4 such that, for some real number k, Q = kuP. The 
2-flat or plane through P generated by u and v is the union of all straights 
through P generated by k, u + kv, for any real numbers k, and kz. The 3-flat or 
hyperplane through P generated by u, v and w is the union of all straights 
through P generated by k,u+k,v+k,w, for any real numbers k, (a = 1, 2, 3). 
Let us say that two n-flats are parallel if they are generated by the same set 
of n linearly independent vectors. The reader should satisfy himself that 
parallel n-flats either coincide or have no points in common. Hence, the choice 
of n linearly independent vectors in R* (n < 4) induces a partition of .@ into 
parallel n-flats. It is clear, moreover, that if 4 is a straight and Q a point of ./@ 
outside A there is one and only one straight through Q parallel to 4, namely, 
that generated by the same vector that generates 4. <.#, R*, H > therefore 
satisfies, like Newtonian spacetime, the Euclidean parallel postulate. 

An affine space is an affine metric space if an inner product is defined on the 
vector space. In our case, we define the Minkowski inner product on R* by 
polarizing the quadratic form f, of Section 4.1." If v and w are two vectors of 
R‘ their inner product is therefore 


Cv, w > = niyviw! (4.2.1) 


If <v,w > = 0, vand ware orthogonal. Two concurrent straights are orthogonal 
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if they are generated by orthogonal vectors. A straight is orthogonal to a plane 
(or hyperplane) it meets if the vector that generates the former is orthogonal to 
those that generate the latter. Since our affine metric space includes some self- 
orthogonal straights, it is not metrically Euclidean. 

We shall now define the interval | P — Q| between two points P and Q in.@. 
Since H is transitive and effective, there is one and only one vector v in R* such 
that OQ = vP. We set 


|P—vP| = <v,v) (4.2.2) 


The vector v and the interval | P — vP| are said to be timelike, spacelike, or null if 
< v,v > is, respectively, positive, negative, or zero. A null or timelike vector v is 
said to be future-directed if its first component v° is greater than zero. The set of 
null vectors is a hypercone in R*, which we call the null cone.* The set of 
worldpoints Q such that Q = vP fora fixed Pe.@ and any null vector vis called 
the null cone at P. 

The affine metric space «.@,R*, H, f, > is essentially equal to Minkowski 
spacetime, as defined in Section 4.1, if the action H meets the physical 
conditions we shall now state. Let P and Q be two worldpoints such that 
Q = uP. Then, (a) if v is a future-directed null vector, P could be the emission 
and Q the reception of a light signal travelling in vacuo between these two 
events; (b) if v is a timelike vector such that < v,v> = 1, P and Q could be two 
ticks, separated by a unit of time, of a natural clock at rest in an inertial frame; 
(c)if <v,v > = Land R = uP = wQ, where wis null and vand ware orthogonal, 
there could be an unstressed rod of unit length at rest in an inertial frame, such 
that P and Q occur at one of its endpoints while R occurs at the other. Any 
mapping of R* x .@ into .@ which satisfies the three mathematical conditions 
(i}-(ii1) on page 91 and the three physical conditions (a}-(c) listed above will be 
called an admissible action. If H is an admissible action, the mapping v 4 vP of 
R* onto .@ is the inverse of a Lorentz chart with its origin at P.2 (Thus ./@ 
recovers through an admissible action the differentiable structure that we 
chose to forget on page 91). If ¢ designates the vector (1,0,0,0), the parallel 
straights generated by t under the action H agree with the worldlines of 
particles at rest on a particular inertial frame F (cf. page 29). Let y be any 
proper homogeneous Lorentz coordinate transformation. The mapping Hy, 
defined, for each ve R* and Pe.@, by 


Ho(v, P) = H(p(v), P) (4.2.3) 


is also an admissible action. (H, fulfils the mathematical conditions, for ¢ is 
linear; it also fulfils the physical conditions, for y preserves the inner product 
on R*). Again, the parallel straights generated by t under H, agree with the 
worldlines of particles at rest on an inertial frame F,. We observe that F 
differs from F if and only if gy is a pk Lorentz transformation. On the other 
hand, if G is any inertial frame, there is certainly a y € “po such that the parallel 
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straights generated by y(t) under H, and hence by t under H,, agree with the 
worldlines of particles at rest in G. The mapping H gt? F, from admissible 
actions to inertial frames is therefore surjective. Let # be the set of admissible 
actions. The mapping L: 2o x # ~ #H defined by L(y, H) = Hy is an action 
of £, on # .* Since L is transitive, under it all elements of ¥ are in a sense 
equivalent: Under this action of £,, # is a homogeneous space (p. 72). If the 
stage of physical events actually has the structure of Minkowski spacetime, the 
equivalence of all inertial frames, proclaimed by the Relativity Principle, 
follows at once from the equivalence of all points in the homogeneous space 
(KH, Lo). The arbitrary choice of a particular action He # for our 
definition of Minkowski spacetime as an affine metric space mirrors the choice 
of the “stationary” frame and the directions of three orthogonal spatial axes in 
Einstein’s original presentation of Special Relativity. 

Before introducing our second view of Minkowski spacetime let us recall the 
main features of Riemannian manifolds. To each point P in an n-manifold S 
there is attached a vector space, the tangent space S, (p. 260). A Riemannian 
metric on S is a rule that assigns to each P € S an inner product pp on Sp and 
satisfies the following requirement of smoothness: if X and Y are any two 
vector fields on S, and Xp, Yp are their respective values at P, the mapping 
P++yp(X p, Yp) is a scalar field, hereafter denoted by w(X,Y) (see p. 258). 
The pair ¢ S, # > is a Riemannian n-manifold. Since pp is a symmetric bilinear 
function, it depends entirely on its value at the (n/2)(n+ 1) distinct (un- 
ordered) pairs of elements of a basis of Sp. Due to the smoothness of p, the 
signature of wp (defined in note 1) cannot change as P ranges over any 
connected component of S. In particular, if S itself is connected (as indeed it 
must be if it isa spacetime manifold) up has the same signature for all P € S, and 
we may therefore speak of the signature o (mu) of the Riemannian metric won S. 
# is positive definite if o(u) =n, negative definite if o(u) = —n. Otherwise 
—n <o(p) <n, and pis said to be indefinite. An indefinite Riemannian metric 
of signature 2—n (on an n-manifold) is called a Lorentz metric (cf. page 280). 

An n-manifold S is parallelizable if there exists a family of n vector fields 
V',..., V", defined on all S, such that their values at each P € S forma basis of 
the tangent space Sp. If U < S is the domain of a chart u of S, U is always a 
parallelizable manifold, for u defines n vector fields ¢/éu’ on U(1 Sq Sn), 
such that the value 


6 
out 


é 


(@) — 
Pp eu 


at a point Pe U is the tangent vector to the q-th parametric line of u through P, 


and the set 
é ra) 
éu! P 


Po au" 


94 RELATIVITY AND GEOMETRY 


is a basis of S, (p. 259). Hence, a Riemannian metric p on S is completely 
determined on the domain of the chart u by the (n/2)(n+ 1) scalar fields 
u(6/dui, 0/éu") generally known as the components of p relative to 
u(1 <q <r <n). Obviously S itself is parallelizable if it admits a global chart. 
Let » be a Riemannian metric on the n-manifold S. If Pe S and v and w belong 
to Sp, we write < v, w >p for xp (v, w); v and w are orthogonal if < v,w >p = 0. 
We also write E(v) for <v,v >p. E(v) is called the “energy” of the vector v.° 
With a view to relativistic applications we say that v is timelike if E(v) > 0, 
spacelike if E(v) < 0,and null if E (v) = 0. (A non-zero null vector is sometimes 
called lightlike.) On the analogy of Euclidean geometry we define the length | v| 
ofa vector vas ,/|E(v)|. A curve y isa smooth or piecewise smooth mapping of 
an interval J < R into S (p. 259). y is said to join any pair of points in its range. 
As on p. 261, the tangent to y at the point y(t) = P(t J) will be denoted by y,, 
or by jp. 7 is spacelike (timelike, null) if, for all t€] such that y,, # 0, y,, 1s 
spacelike (timelike, null). Two curves y and y’ whose ranges intersect at a point 
P are orthogonal at P if they have mutually orthogonal non-zero tangent 
vectors P. This definition can be easily extended to surfaces, etc. If J is the 
closed interval [p,q], y(p) and y(q) are the endpoints of the curve y, which is said 
to go from the former to the latter. The integral 


q 
E(y)= | E (v4) dt (4.2.4) 
Pp 

is called the “energy” of y. y is said to be energy-critical if it is a critical point of 
the function E as it ranges over all parametrized curves defined on the same 
closed interval and having the same endpoints as y.° If J is an arbitrary real 
interval, the curve y: J > S is said to be Riemannian geodesic of <S, p > if it is 
energy-critical on every closed subinterval of I. If y:[ p,q] — S is spacelike or 
timelike we define the length of y as: 


q 
LQ) = | | Yt] dt (4.2.5) 
P 


Note that this concept of length only allows for comparison between curves of 
the same type (spacelike or timelike), and does not apply to null curves. The 
main significance of the length L(y) lies in the fact that it is invariant under 
reparametrizations; in other words, if fis any homeomorphism of R onto itself, 
L(y-f) = L(y). The length of a parametrized curve may therefore be regarded 
as a property of the path that is its range. If |),,| is constant, equal'to 1, the 
length of y’s domain (in R) and each of its subintervals is equal to the length of 
the corresponding image (in S), and y is naturally said to be “parametrized by 
arc length”. If the metric # is either positive or negative definite, <S, > is said 
to be a proper Riemannian manifold. In such a manifold all curves are timelike 
(if 4 is positive definite) or spacelike (if is negative definite) and one may 


MINKOWSKI SPACETIME 95 


speak of a length-critical curve, viz. a curve that is a critical point of the 
function L on the set of all curves defined on a given interval and sharing a 
given pair of endpoints. It can be shown that, if y isa Riemannian geodesic —as 
defined above—in a proper Riemannian manifold, y is parametrized by arc 
length and is length-critical on every closed subinterval of its domain.’ If 
< S, «> isa proper Riemannian manifold, a distance function d:§ x § > Rean 
be defined as follows: for any P,Q €S,d(P, Q) is the infimum or greatest lower 
bound of the lengths of all the curves that go from P to Q.. ¢ S,d > is a metric 
space in the usual sense.® 

Now we return to the consideration of Minkowski spacetime. Let .@ be 
Minkowski's world, that is, the set of worldpoints, with the differentiable 
structure defined by the collection of Ay of Lorentz charts. Since any chart 
x€ Ay is global, .@ is parallelizable and a Riemannian metric on .# can be 
defined by specifying its values on the ten several pairs of vector fields 6/0dx', 
whose integral curves are the parametric lines of x. In particular, the 
Minkowski metric q is defined by: 


The n; ,; have here the values given under convention (iii) on page 89, and must 
be regarded as constant scalar fields on .#. The signature of 4, 2,7,;, is equal 
to —2. The Riemannian 4-manifold <.4,9 > is Minkowski spacetime. The 
name is justified insofar as the Riemannian structure <4, 4 > is determined 
by the affine metric structure ¢«.4, R*, H,f, > of pp. 91-93, and vice versa. We 
saw already on p. 92 that the admissible action H defines Lorentz charts on .@ 
whereby it bestows on the latter the same differentiable structure as A y. Let x 
be one of the Lorentz charts defined by H. The metric 9 on .@ is then 
determined by the simple requirement that, for any pair of tangent vectors v 
and w, at a worldpoint P, yp(v,w) =f, (x,pv,X,pw), where x, is the 
differential of x (p. 261). Due to the Lorentz invariance of f,, if the requirement 
is fulfilled for one such chart x, it is fulfilled for all. On the other hand, if 
<.M, 1 > is given and x is any Lorentz chart, the mapping H,:R* x “4.4, 
defined by H,(r, P)=x~' (x(P)+r), is an admissible action in the sense of 
page 92, as the reader can verify for himself. If y is another Lorentz chart such 
that y-x~' is a translation, H, = H,, but if y-x~'e Yo, Hy is another 
admissible action related to H, as H, is related to H by equation (4.2.3). The 
Riemannian and the affine metric structures that characterize Minkowski 
spacetime do not only determine one another, but are also mutually 
compatible in the following sense: A curve y in .@ is a geodesic of the said 
Riemannian structure only if its range lies entirely on a straight of the affine 
metric structure.’ If the geodesic y, with endpoints P and Q, is spacelike or 
timelike and parametrized by arc length, or if y is null, the interval |P—Q| 
defined on p. 92 is equal to the “energy” E(y). If y is null, P and Q could be, 
respectively, the emission and reception of a light signal, sent across empty 
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space. If y is spacelike, there is an inertial frame F in which P and Q are 
simultaneous, and the length L(y) is the Euclidean distance between their 
places of occurrence in F. If y is timelike there is an inertial frame F in which P 
and Q occur at the same place, and L(y) is the time elapsed between them in F. 
If m is a massive particle, arbitrarily moving in an inertial frame F, there is a 
curve yin.# parametrized by proper time, whose range is the locus of events at 
n. We call both y and its range, the worldline of x (cf. p. 23). Since 2 must move 
in F with a velocity less than that of light (p. 71), y is timelike. At each instant ¢ 
the worldline of x is tangent to a timelike geodesic y', which is the worldline of 
a point at rest in some inertial frame F,. At the instant t, 7 moves like F,. We 
therefore call F, the momentary inertial rest frame (mirf) of z at that instant. 
If, but only if, z isa free particle, F, does not change with t, and the worldline of 
m is a timelike geodesic. 
If y:[p, q] > -@ is a timelike curve, the integral 


q 
t(y) = | [Yue] de (4.2.7) 
P 
is called the proper time from event )(p) to event y(q), along the worldline y. 
This concept introduced by Minkowski is probably his most important 
contribution to physics.'° We have observed already that any time-keeping 
device or clock keeps time at a place only (e.g., at 0° Lat., 0° Long. on a 
spherical spinning Newtonian clock, or at one of the endpoints of a Langevin 
clock, etc.). The worldline of this place is the locus of successive events which 
mark and measure the time. Since clocks are massive material systems, the said 
worldline is the range of some timelike curve y. We now identify the clock with 
a particle x describing this worldline and we postulate that, if it is accurate, its 
rate agrees at each instant with that of a Langevin clock at rest on z’s mirf. This 
postulate—sometimes called the clock hypothesis''—may be viewed as a 
conventional definition of what we mean by clock accuracy, and hence by 
physical time. But Special Relativity would doubtless have been rejected or, at 
any rate, deeply modified if the clock hypothesis were not fulfilled—to a 
satisfactory approximation—by the timepieces actually used in physical 
laboratories. The clock hypothesis implies that the time measured by our clock 
between any two events P and Q is none other than the proper time along the 
clock’s worldline from P to Q. Relativistic clocks are hodometres of timelike 
worldlines. Had more attention been paid to this fact, much of the effort spent 
on the so-called Clock Paradox would have been spared. Since the integrand in 
(4.2.7) is not an exact differential, the proper time must depend on the 
integration path. Hence, it is not surprising that two accurate clocks which 
coincide at two events but travel differently between them agree on their first 

meeting but disagree on the second one (see pp. 68 f.). 
The relativistic effects discussed in Section 3.5, pages 67 to 69, can 
also be explained geometrically, as a consequence of the structure of 
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Minkowski spacetime. To show it we shall examine the image in R* of a 
neighbourhood of the origin of a Lorentz chart x, adapted toa frame X. Figure 
1 is a diagram of the (x°, x')-plane. Vertical lines are the images of the 
worldlines of points of X. Horizontal lines join the images of events 
simultaneous in X. The null geodesics described in spacetime by light signals 
through the origin are mapped by x onto the diagonal lines w, and @,. 
Consider now a Lorentz chart y adapted to a frame Y and related to x by 
equations (3.4.18). The worldline of the spatial origin of y is mapped by x onto 
the line yz. All events (on the plane considered) that are simultaneous in Y with 
the common spatio-temporal origin of both charts are mapped by x on the line 
v. The reader ought to verify that, if our diagram is endowed with the 
restriction to the (x°, x')-plane of the Minkowski inner product (4.2.1), the 
lines and v are orthogonal, no less than the horizontal and vertical axes. (Cf. 
also p. 90). Moreover, the null lines @, and w, are not only orthogonal with 
each other, but each with itself. I have drawn the hyperbolae (x°)? —(x!)? = 1 
and (x°)? —(x!)? = —1. Any spacelike (timelike) geodesic of unit length, 
drawn from the origin, on the plane represented by our diagram, has its other 
endpoint mapped by x on a point of the latter (respectively, the former) 
hyperbola. Thus if the origin is an event E (mapped by x and y on O)at one end 
of a rod of unit length at rest on frame Y, whose other end points in the 
direction of increasing y', the event that takes place at that other end, 


x9 


w 


Figure 1. 
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simultaneously with E in Y, is mapped by x on the point A, where v meets the 
right branch of the second hyperbola. The point B where the x'-axis meets the 
parallel to » through A is then the image by x of an event at the same end of the 
rod, but simultaneous with E in frame X. The distance in X between the two 
ends of our rod is the length of the spacelike geodesic mapped by x onto OB, 
which is clearly less than 1. Relativistic length contraction is thus seen to follow 
from the geometry of Minkowski spacetime.'? The same can be said of time 
dilation. Consider the event mapped by x on 7, where 4 meets the upper 
branch of the first hyperbola. In frame Y this event occurs exactly one time unit 
after the event mapped on O. However, the time elapsed between both events in 
frame X equals the length of the timelike geodesic mapped by x onto OS 
(where S is the perpendicular projection of T on the x°-axis), which is of course 
greater than 1. A process that takes place at a fixed point of Y and lasts unit 
time by Y’s clocks, takes longer than unit time by the clocks of X. The 
reciprocity of such relativistic effects can be readily verified on the diagram: a 
rod of unit length at rest in X measures less than | in Y (draw the perpendicular 
to the x!-axis at P and see where it meets v); a process that takes unit time in X 
takes longer in Y (draw the parallel to v through Q and see where it meets jp). 
The relativity of simultaneity can also be demonstrated on Fig. 1: each parallel 
to v is the image by x of a set of events simultaneous in Y; any two such events 
are mapped on different horizontal lines; hence, they are not simultaneous in 
X. Of course, the geometry of events can be read off our diagram only because 
we initially resolved to read it into it, namely, by introducing the Minkowski 
inner product in R*—and hence in the section of R* represented in Fig. 1. It is 
therefore misleading to say that, by means of such diagrams, Minkowski 
geometry treats time as if it were merely an additional dimension of space. 
Indeed Fig. 1 is useful only if one succeeds in forgetting its accepted Euclidean 
metric, and substitutes for it the non-Euclidean metric of Minkowski 
spacetime. By this act of thought every line that deviates from the vertical 
direction by less than 45° acquires the formal properties of timelike worldlines. 
One must therefore learn to “de-spatialize” some of the directions on the page 
in order to draw on it a non-spacelike section of spacetime. 


4.3, Geometrical Objects 


If the stage or “arena” of physical events is conceived as a differentiable 
manifold, physical laws can be expressed as relations between its geometrical 
objects. Some definitions will be necessary in order to show what a wealth of 
possibilities this opens. By a geometrical object of an n-manifold S we 
understand here, primarily, a cross-section of a fibre bundle over S or over an 
open submanifold of $;' secondarily, the value of such a cross-section at a 
point of S (geometrical object at a point); and, more loosely, any entity 
constructed from geometrical objects in the preceding sense, or defined 
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analogously. As explained on p. 262, each point P in an n-manifold S is 
associated wih an infinite array of vector spaces, namely, the tangent space Sp, 
its dual space, the cotangent space S# of linear functions on Sp, and the several 
spaces of multilinear functions on the product spaces formed from Spand S#, 
in every conceivable combination. An element of any of these spaces is called a 
tensor at P. A (0,q)-tensor (or covariant tensor of order qg) at P is a q-linear 
function on (Sp)*, while a (q, 0)-tensor (or contravariant tensor of order q) at P 
is a q-linear function on (S})*, and a (q, r)-tensor (mixed tensor of order g +r) 
at Pisa (q +r)-linear function on (S$)* x (Sp)’.? Evidently, a (0, 1)-tensor at P is 
an element of S#; it is also called a covariant vector or covector at P. On the 
other hand, in virtue of the canonical identification of a vector space with the 
dual of its dual, a (1, 0)-tensor at P is an element of Sp, and is better knownasa 
contravariant vector or simply as a vector at P. We agree to call any scalar (real 
number) associated with Pa (0,0)-tensor at P. A tensor ¢ of order 2 is symmetric 
if, for every pair (u,v) in its domain, t(u,v) = t(v, u); it is skew-symmetric if 
t(u, v) = —t(v, u). The aggregate of all pairs (P, tp), where Pe S and tpisa (q, r)- 
tensor at P, is made into a bundle over S. We call it the (q,r)-tensor bundle of S. 
(The tangent bundle of Sif q = 1 andr = 0; the cotangent bundle if q = Oandr 
= 1.) A (q,r)-tensor field T on S is a cross-section of the (q, r)-tensor bundle of 
S. The value T(P) of T at PeS is a pair, whose first member is P itself, and 
whose second member isa (q, r)-tensor at P, which we denote by Tp. If, for each 
PéS, Tp is an alternating r-linear function on S, (that is, a (0, r)-tensor which 
takes the value 0 whenever its argument—a list of r vectors—contains the same 
vector twice), T is said to be an r-form. (Note that, according to this definition, 
every covector field is a l-form.) The (0,0)-tensor fields or scalar fields on S 
constitute the algebra ¥ (S) defined on page 258. The (q, r)-tensor fields on S-— 
for each given pair of non-negative integers qg and r—constitute a module over 
F (S), which we denote by ¥4(S). (In the latter expression we omit the upper or 
the lower index if it happens to be zero.)* 

An ordered basis (e', . . ., e") of the vector space Sp (PS) is called an n-ad 
(repere) at P. The ordered basis (e¥, . . .. e*) of the dual space S#, defined by 
the relations e¥ (e/) = 6,;(1 < i,j <n), is its dual co-n-ad. The aggregate of all 
n-ads (co-n-ads) at every Pe S can be made into a principal fibre bundle over 
S.° A cross-section of the bundle of n-ads of S is called an n-ad field (repére 
mobile). It is obviously an ordered basis of the module ¥ '(S). A co-n-ad field is 
defined analogously and is an ordered basis of ¥’, (S). It follows from what we 
said on page 93 that an n-ad field on S exists if and only if S is paralletizable. If 
T is a (q,r)-tensor field on S (q+r 21), T is associated on each open 
submanifold UcS with a unique (q+r)-linear mapping of (¥,(U))? 
x (V1 (U)Y into ¥(U), which is also designated by T and is defined as follows: 
If V',..., VW’ are vector fields on U and V,,..., V, are covector fields on U, 
T(Visee ss Vas ... VW’) is the scalar field on U whose value at each point 
PeU is equal to Tp((V,)p,.. ., (Vq)p, Vb... Vp). If U is parallelizable and 
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X = (X!,...,X")is an n-ad field on U,and X* = (X*,..., X*)is the dual co- 
n-ad field defined by the relations X¥(X/) = 6, ;, (1 < i,j <n), the (q+r)-linear 
mapping T is fully determined by its values on the n?*’ possible lists of g 
elements of X * followed by r elements of X. These values constitute an array 
of scalar fields on U, called the components of the tensor field T relative to the n- 
ad field X. If X = (0/éx',..., 6/0x"), where the 0/dx' are the vector fields 
defined on U by a chart x, in the manner explained on page 259, we usually 
refer to the components of T relative to X as its components relative to the 
chart x.° The (g,r)-tensor field T is therefore uniquely associated, on each 
parallelizable open submanifold U c S, witha mapping T of the set Ly of n-ad 
fields on U into (F(U))""’, which assigns to each X eZy the list of T’s 
components relative to X.’ The mutual linear dependence of the n-ad fields of 
U and their dual co-n-ad fields governs the mutual dependence of the 
respective components of T, which can therefore be correctly, though 
somewhat abstrusely described as so many arrays of real-valued functions, 
which “transform into each other” according toa definite rule. If the manifold $ 
itself is parallelizable one may even identify T with the aggregate of such 
arrays, structured by the appropriate transformation rule, a view generally 
taken in the early days of the Tensor Calculus. 

Let g be a Riemannian metric on S (p. 93). The reader ought to satisfy 
himself that g is in effect a symmetric (0, 2)-tensor field on S. g determines a 
canonic linear isomorphism of ¥"! (S) onto Y; (S), which associates each vector 
field V with a unique covector field V, defined as follows: for any We ¥ '(S), 
V(W) = g(V, W). V and V are usually regarded as the contravariant and the 
covariant form of the same geometrical object. The said canonic isomorphism 
induces definite linear isomorphisms between any pair of modules of tensor 
fields of the same order, on S. Take, for example, the modules ¥ ?(S), ¥;(S)and 
¥1(S) of tensor fields of order 2, Each (2,0)-tensor field T is uniquely 
associated with a (0,2)-tensor field T and a (1, 1)-tensor field T’ defined as 
follows: for every V, We ¥'(S), T(V, W) = T’(V, W) = TCV, W). T, T and T’ 
are regarded as the contravariant, covariant and mixed forms of the same 
object. Their respective components with respect to a given chart x are 
usually denoted by T'! (= T(dx',dx/)), T,;(= T(0/6x', 0/dx’)), and 
T'; (= T'(dx', 6/éx’)), whence the habit of saying that such tensor fields are 
derived from one another by “raising or lowering indices”. 

On the domain U of a chart x the Riemannian metric g can be represented 
by the matrix of its components relative to x. These are usually denoted by 
gij(= 2(0/0x', 0/éx’)). The restriction of g to U is an element of ¥(U) 
and hence an ¥ (U)-linear combination of the tensor products dx'@ dx’ 
(1 <i,j <n), which constitute a basis of ¥,(U ). (Tensor products are defined 
on p. 349, note 15; function differentials on p. 261). It will be readily seen that, 
on U, 


g= gi, Ax' @dx! (4.3.1a) 
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In classical differential geometry a Riemannian metric g was usually expressed 
in terms of the coordinate differentials associated with a chart x as the line 
element ds, given by 


ds? = g,,dx'dx! (4.3.1b) 


For historical reasons, we shall often use the latter form of expression. But 
rather than squeeze our brains to penetrate its original meaning, we shall 
simply regard it as synonymous with (4.3.1a). 

If physical realities are represented by tensor fields on the spacetime 
manifold, physical laws must say how different fields vary together from 
worldpoint to worldpoint. This involves a comparison between the values of 
each field at neighbouring worldpoints. Such a comparison is possible in 
principle, because the several values of a tensor field belong to isomorphic 
vector spaces—namely, the fibres of the pertinent tensor bundle over the 
arguments corresponding to those values. However, since there are many 
different isomorphisms between these spaces, one must be chosen in order to 
institute a definite method of comparison. Some choices are natural. Thus, as 
we know, a vector field Ve ¥! (S) has a maximal integral curve through each 
PeS, which we shall denote here by y? (p. 260). V determines, on a 
neighbourhood Up of each Pe S,a family {F,|t 1} of regular mappings of Up 
into S (indexed by a real interval J about 0), such that F, sends each Qe U, to 
y@(t). For each re J, the differential F,, defines the linear isomorphism F,,9 of 
Sg onto S-(g,and induces other well-defined isomorphisms between the other 
vector spaces associated with Sg and the corresponding vector spaces 
associated with S; (g). The existence of these isomorphisms makes it possible to 
define the so-called Lie derivative LyT of a (q,r)-tensor field T along the vector 
field V on an arbitrary n-manifold S. (LyT is a (q,r)-tensor field like T.)® The 
Lie derivative of a vector field along another can be used for characterizing 
Cartan’s exterior derivative, a differential operator which, like the Lie 
derivative, is well defined on any n-manifold S, but acts only on a special kind 
of tensor fields, namely, r-forms. (The exterior derivative dw of an r-form w is 
an (r+1)-form.)? In Appendix C, I explain how the choice of a “linear 
connection” on a manifold S$ determines a linear isomorphism Teg of Sp onto 
Sg, for each pair of points P,Q ¢ Sand each curve y joining them, which in turn 
induces matching isomorphisms between the spaces of (q,r)-tensors at P and 
Q. (Tg is said to “parallel-transport” Sp to Sp along 7.) These isomorphisms 
enable us to define the covariant derivative VT of a (q,r)-tensor field T on the n- 
manifold S, endowed with a given linear connection. (VT is then a 
(q,r + 1)-tensor field on S.)’° There are indeed infinitely many different ways of 
bestowing a linear connection on a given n-manifold S, but if Sis Riemannian 
the metric defines a unique linear connection, called the Levi-Civita connection, 
which is in a sense a natural choice and provides a definite and powerful 
procedure for comparing the values of any given tensor field at nearby 
points.!} 
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Though the theory of tensors on manifolds was created by Ricci in the late 
19th century, the conceptual scheme I have just sketched was not developed 
until much later, after Einstein’s use of tensor methods in General Relativity 
encouraged mathematical research in the field. The first formulations of 
spacetime physics by Minkowski and others did not represent physical entities 
by tensor fields in the above sense, but by geometrical objects of a somewhat 
different sort. Minkowski defined them explicitly as objects at a point, but his 
differential equations express local relations between fields that take the 
objects defined by him as values. Minkowski called his objects spacetime 
vectors of the first and the second kind, but we shall call them, after 
A. Sommerfeld (1910), 4-vectors and 6-vectors, respectively. A 4-vector is 
defined by Minkowski as an “arbitrary array” of four real numbers 
("o.'1s12,13), “together with the rule that”, whenever a proper homogeneous 
Lorentz transformation, governed by the equations x' = a, ,}”, is applied, the 
array is to be replaced by the quadruple (rg, 1; 75,3), obtained as a solution of 
the transformation equations when the r; are substituted for the x,.'7 In the 
subsequent discussion it becomes clear that such number quadruples as- 
sociated with a transformation rule are attached to particular worldpoints. A 
4-vector field may therefore be characterized as a correspondence which 
assigns a 4-vector to each worldpoint and varies smoothly from one 
worldpoint to another. Such a characterization is achieved simply by repeating 
the definition of a 4-vector, substituting “array of four smooth real functions 
(on Minkowski’s world)” for “array of four real numbers”. Though 4-vector 
fields do not at first sight seem to have much to do with our earlier concept ofa 
vector field on a 4-manifold they can be easily constructed from the latter. We 
recall that Minkowski’s world .@ is parallelizable. Hence a vector field V on. 
can be identified with the mapping that assigns to each tetrad field X on .@ the 
components of V relative to X. Let us say that the tetrad field (6/éx°, 0/0x', 
C/éx?, @/6x3) defined on .M@ by a Lorentz chart x is a Lorentz tetrad field. It is 
not hard to see that the restriction of a vector field on .# (in the sense just 
explained) to Lorentz tetrad fields is exactly the same as a 4-vector field on 4, 
according to the above characterization. Since each vector v at a worldpoint P 
can be expressed as a linear combination v'd/@x'|p, of any given Lorentz tetrad 
at P, v can be described, correctly even if somewhat clumsily, as the set of its 
components v’', relative to an arbitrary Lorentz tetrad, plus the rule for 
calculating from them its components relative to any other Lorentz tetrad. In 
this way, any vector at a worldpoint may be seen as a 4-vector at a point. This 
approach to 4-vectors and 4-vector fields not only solves the mystery of their 
transformation rule, but also enables us to extend Minkowski’s conception, 
almost without effort, to geometrical objects of greater complexity. We simply 
define a 4-tensor field of order q as the restriction to Lorentz tetrad fields of a 
tensor field of order g on 4, conceived as on page 100, i.e. as a mapping of the 
set 2 y of tetrad fields on 4 into the appropriate Cartesian power of the ring 
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F (.M). In particular, a 6-vector field can be identified with a skew-symmetric 4- 
tensor field of order 2. (Skew-symmetry implies that it has no more than 6 
independent components relative to each Lorentz tetrad field, whence 
Sommerfeld’s name for it.)'? Again, any tensor at a worldpoint may be viewed 
as a 4-tensor at that point. Evidently all the concepts of spacetime geometry 
defined in terms of the Minkowski metric 4 and its operation on ordinary 
spacetime vectors retain their sense when we view the latter as 4-vectors. We 
may speak, therefore, of the energy, the orthogonality, etc., of 4-vectors. 
4-vectors and 6-vectors are the spacetime analogues of the polar and axial 
vectors of late 19th-century physics. The latter are geometrical objects of 
Euclidean space, but not of the ordinary kind defined above for arbitrary 
manifolds. They belong to the family of so-called Cartesian tensors, which we 
shall now characterize—as we did with 4-tensors—in terms of ordinary 
tensors. Let us say that the triad field (¢/éx'!, 6/éx?, G/dx*) determined by a 
Cartesian chart x on the Euclidean space &°, is a Cartesian triad field. Since 3 
is parallelizable we can identify tensor fields on &? with the corresponding 
mappings from triad fields to components. A Cartesian tensor field of order q 
on &° is the restriction to Cartesian triad fields of an ordinary tensor field of 
order g on &°. In particular, a field of polar vectors is a Cartesian tensor field of 
order 1, or Cartesian vector field, while a field of axial vectors is a skew- 
symmetric Cartesian tensor field of order 2. (Skew-symmetry implies that it 
has no more than three independent components relative to each Cartesian 
triad field.) Again, any tensor at a point of &° can be viewed as a Cartesian 
tensor at that point. As the covariant and contravariant forms of the 
Pythagorean metric of Euclidean space have the same array of components 
relative to each Cartesian triad field—namely the 3 x 3 matrix of constant 
scalar fields (6,,}—there is no point in distinguishing between covariant, 
contravariant or mixed Cartesian tensors or tensor fields. In physical 
applications a Cartesian tensor field is defined on the relative space S; of a 
given frame of reference F. If, as it is usually the case, the field is supposed to 
vary with time, it is, strictly speaking, a tensor field on the product manifold 
R x Sr. Note, however, that not every tensor field on this 4-dimensional 
manifold can be equated with a time-varying tensor field on S;-. A tensor field T 
of order g, on R x S¢, is, as we shall say, a time-dependent Cartesian tensor field 
on S, if and only if, for each te R, the restriction of T to the 3-dimensional 
submanifold {t! x S; isa Cartesian tensor field of order g on this submanifold. 
Thus, in particular, a vector field V on R x S; is a time-dependent Cartesian 
vector field on S; only if the value of V at an arbitrary point re S;- at an 
arbitrary time t is tangent to the hypersurface {t} x S, (p. 261). On the other 
hand, any scalar field fon R x S; is a time-dependent scalar field on S;, for it 
assigns a real number f (¢,r) to each pair (t€ R,re S;), .e. to each point of space 
and each time—being of course smooth for fixed t. By the same token, f can 
also be identified with a scalar field on Minkowski’s world. The representation 
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of a physical entity by a time-dependent Cartesian tensor field on the relative 
space S, does not depend on the choice of a particular Cartesian chart on S;, 
but is tied of course to the reference frame F. The representation of the same 
physical entity with regard to a different frame F’ will be given by a different 
Cartesian tensor field, defined on the relative space S;-. The correspondence 
between the tensorial representations of the same physical reality with respect 
to different inertial frames was established in classical physics by means of 
suitable variations of the Galilei Rule (p. 29). This rule does not follow from 
the mathematical concept of a Cartesian tensor, but must be regarded, in the 
present context, as a physical law. It was confirmed by mechanical experiments 
{involving speeds much less than c), but foundered when applied to 
electrodynamics. The problem that the Galilei Rule thus failed to solve does 
not even arise when physical entities are represented by 4-tensors and 4-tensor 
fields. In contrast with the infinitely many dynamically equivalent relative 
spaces of Newtonian physics, the physical manifold on which 4-tensors are 
defined is unique, and the representation of a physical entity by a geometrical 
object of this manifold is inherently frame-independent. The physical object 
may therefore ina sense be said to be conceived as, not just represented by such 
a geometrical object. (However, we ought not to overlook that the geometrical 
object with which a given physical object is identified in this way can only be 
determined up to a choice of units.) 

To understand how Minkowski and his successors went about conceiving 
physical realities as 4-vectors and 6-vectors we must take cognizance of one 
more geometrical fact. Consider an inertial frame F. A Lorentz chart x adapted 
to F determines the Lorentz tetrad field (@/éx') on Minkowski spacetime, of 
which we may also say that it is adapted to F. Let V be a 4-vector field. V is 
equal to V‘d/éx', where V' = Vx' (see note 6). Relatively to F, V may therefore 
be regarded as the sum of a field of timelike vectors, V°0/0x°, and a field of 
spacelike vectors V = V*0/0x*. This decomposition of V, relative to F, does 
not depend on the choice of a particular Lorentz chart adapted to F.'* At each 
worldpoint P, the local value of V is orthogonal to that of 6/éx°. This implies 
that the integral curve of V through P lies entirely on the hyperplane Fp, of all 
worldpoints simultaneous with P in F. Fp isa copy of Euclidean space and may 
be regarded as a representation of the relative space S; at the time of P (in F). 
The restriction of V to Fp can be identified, rather obviously, with a Cartesian 
vector field on Sp.'> V itself can therefore be regarded as a time-dependent 
Cartesian vector field on Sr. On the other hand, V°d/éx° can be identified with 
the scalar field V°, which may also be viewed as a time-dependent field on S-. 
We call V and V°, thus understood, the spacelike and the timelike parts of the 
4-vector field V, relative to the frame F. 

To illustrate these ideas we shall now define the spacetime counterparts of 
the basic concepts of classical kinematics. The motion of a particle z in an 
inertial frame F is represented classically by a curve 7, parametrized by time ¢, 
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in the relative space S;. assigns to each real number t in its domain the point 

of S; where 7 is located at time t. The velocity of x at time t is the Cartesian 

vector u= 7,,, tangent to } at 7(t). Its three components relative to the 
Cartesian chart y (on S;-) are given by 

dy" 

Ma at 


(4.3.2) 


Spacetime geometry enables us to represent the motion of z, without referring 
it to a particular frame, by a curve y, parametrized by proper time t, in 
Minkowski spacetime. y(t) is the event on z’s history at proper time t 
(measured along z’s worldline). The 4-velocity of zat tis the 4-vector U = y,,, 
tangent to y at y(t). Its components relative to the Lorentz chart y, adapted to F 
and associated with y in the manner described on p. 56, are given by!® 


Us= (4.3.3) 


Relativity was born out of the realization that (4.3.2) is meaningless until we 
make up our minds as to what we understand by the time t. If we assume that t 
is time measured by natural clocks at rest in F and synchronized by Einstein’s 
method (p. 53) along the path of 7, we have that dt = dy®. By the clock 
hypothesis (p. 96), the proper time t agrees at each moment with the time 
measured by natural clocks on the mirf of z. Consequently, if x is a Lorentz 
chart matching y and adapted to z’s mirf at a given moment, then, at that 
moment, dt = dx°. Hence, by (3.5.2), 


dt = B,dt (4.3.4) 
(where u = |u| is the speed of x in F at the moment in question, and 
B, = 1/./1—u7). It follows that 
dy° dt dy? dt 
y ued 


pel cena 4.3. 
dedc*™ di da 7 Puls (27>) 


The spacelike part of the 4-velocity U of x, relative to the inertial frame F, is 
therefore given by 


U=B,u (4.3.6) 


while its timelike part U° equals the “Lorentz factor” f,. In particular, if F 
happens to be the mirf of xz, U = Oand U® = 1. Hence the 4-velocity U of any 
particle has constant “energy” E(U) = 1.'’ Following the analogy of the 
classical acceleration a, whose components relative to y are a, = du,/dt, we 
define the 4-acceleration A, whose components relative to y are 


Ai = dUi/dt (4.3.7) 
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A is always orthogonal to U, for (U,A)> = 9, ,U'dU4/dt = 3d(n, ,U'U4)/dt 
= 3d E(U)/dt = 0.'8 

The representation of physical realities by 4-tensor fields, instead of 
ordinary tensors, on Minkowski spacetime <.#,4 > has one considerable 
practical advantage. If Tj is a component of a (q,r)-tensor field T relative to a 
Lorentz chart x (where a and b stand for appropriate lists of indices), the 
respective component relative to x of the covariant derivative of T along 0/éx' 
is simply equal to 0T?/0x'. Therefore, if T is restricted to Lorentz tetrad fields, 
one may describe its evolution from worldpoint to worldpoint with the 
familiar resources of the differential calculus. However this advantage can only 
be achieved at a price. By definition a 4-tensor field only takes components on 
Lorentz tetrad fields, defined by Lorentz charts adapted to inertial frames. 
Hence, if a physical entity is equated with a 4-tensor field one cannot 
meaningfully assign it components relative to a non-Lorentz tetrad field, such 
as could be ascertained by measurements performed on a non-inertial frame. 
The physical entity in question lacks therefore all representation and is as good 
as non-existent with respect to such a frame. To a philosopher such a price 
must seem enormous, for, strictly speaking, no terrestrial laboratory is inertial. 
But the working physicist may well ignore it, just like he ordinarily ignores—or 
corrects for—the action of extraneous forces on his instruments. As traditional 
methods of measurement tend to rely heavily on the use of unstressed rods, the 
results obtained by them only make sense in the ideal limit in which the 
relevant pieces of the experimental set-up can be regarded as force-free. Thus 
the use of 4-tensor fields instead of ordinary tensor fields for the representation 
of physical entities may not be such a big handicap after all. Nevertheless, as 
soon as one has learnt to understand a 4-tensor field as the restriction ofa more 
natural mapping to a choice part of its domain, one can easily avoid the 
drawbacks of such a restriction without giving up its advantages. For a 4- 
tensor field can be extended to an ordinary tensor field in only one way. (If we 
are given the components of a (q, r)-tensor field T relative to any tetrad field on 
&—such as the one defined by a particular Lorentz chart—its components 
relative to any other such tetrad field are completely determined by the 
appropriate transformation rule.) Hence there is no difficulty in equating any 
physical entity suitably represented by a 4-tensor field on .4 with the latter’s 
unique extension to an ordinary tensor field on .#.'? However the procedure 
has been questioned, out of what I can only describe as inductivist 
squeamishness, by some philosophically minded scientists. As far as I can 
understand them, their argument appears to be the following: experimental 
data which, when referred to different inertial frames, transform like the 
components of a 4-tensor relative to Lorentz charts adapted to those frames, 
need not, if referred to arbitrary frames, transform like the components of an 
ordinary tensor relative to suitable charts; hence, one may not conclude that a 
physical entity which is adequately represented by a 4-tensor field T is also 
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represented adequately by T’s extension to an ordinary tensor field, unless one 
has experimental proof of it.2° But even if one may not conclude it, one can 
reasonably conjecture it, and try and see if experience will prove one wrong. 
Moreover, if an experimental refutation of the proposed extension of 4-tensor 
fields to ordinary tensor fields on .@ were ever to be found, it would cause the 
downfall of the entire theoretical outlook on which the representation of 
physical entities by 4-tensor fields is based. Suppose, for example, that a well- 
corroborated physical law equates two 4-covector fields on.#, w, and w,. This 
means that, for each Lorentz chart x, w, (6/0x') = w,(0/éx'). Let @, be the 
ordinary covector field on .# of which w, is the restriction to Lorentz tetrad 
fields (r = 1,2). Suppose that experience somehow shows that @, # @,. This 
implies that for some arbitrary vector field V on .4, or on part of 4, 
@,(V) # @,(V). Since V = Vx'd/éx', for any Lorentz chart x, we have that 
@,(V) = Vx'w, (6/éx') = Vx'w, (d/dx') = @,(V), a contradiction. 


4.4 Concept Mutation at the Birth of Relativistic Dynamics 


Let x be a particle moving with velocity u and acceleration a in an inertial 


frame F. Classical dynamics represents 7’s “quantity of motion” or momentum 
by the vector 


P= mu (4.4.1) 
(where m is the particle’s “quantity of matter” or mass). The external force 


acting on 7 is represented by the vector k, which, by Newton’s Second Law of 
Motion, equals the time rate of change of p: 


y= UPL Aer) 
“dt dt 


(4.4.2) 


As the action of the force on an otherwise isolated particle does not alter m, 
d(mu)/dt = mdu/dt = ma, and the mass equals the ratio of force to acceler- 
ation. Mass is thus a measure of the particle’s “laziness” or inertia, 1.e., of its 
resistance to the accelerative action of a given force. p and k are tied, as 
Cartesian vectors, to a particular Euclidean space, the relative space S; of 
frame F. Nevertheless, they display a certain absoluteness, that argues for the 
Newtonian physicist’s belief that they both stand for substantive, frame- 
independent physical realities. Thus, the force on z is represented in the spaces 
of any two inertial frames, F and F’, by two vectors, say, k and k’, which have 
respectively the same components relative to any pair of Cartesian charts on 
the pertinent relative spaces whose coordinate axes coincide at some time, as F 
and F’ move past each other (p. 19). k and k’ are thus, in a sense, “the same 
vector”, and, as far as I know, no classical physicist ever bothered to distinguish 
them. Consider now the momenta p,, relative to the inertial frame F, of the 
particles 2; in an isolated material system S. Newton's Third Law implies that 
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the total momentum Z,p, of S relative to F is conserved through time. An 
inertial frame Fy can always be found, relative to which the total momentum of 
Sis 0. If F moves in Fy with velocity u, £;p; = (2;m,)u, where m, is the mass of 7; 
and hence 2m; is the total mass of S. Consequently, if the inertial frame F’ 
moves in F with velocity v, the total momentum of S relative to F’ is 


Lip; = (2m) (u+v) = Z;p,+ (2,m,)v (4.4.3) 


These simple and natural relations bespeak the reality of momentum: though 
its vectorial representation varies from frame to frame, the difference does no 
more than reflect the change of frame. 

The classical definition of momentum (4.4.1) and the Law (4.4.2) that 
determines the classical representation of force both depend essentially on 
differentiation with respect to time and are therefore meaningless until the 
meaning of time is fixed. If time is frame-dependent, (4.4.2) cannot yield a 
frame-independent representation of force in the manner described above. 
Moreover, velocities will not transform according to the Galilei Rule, as in the 
calculation underlying (4.4.3). Classical dynamics is firmly bound to absolute 
Newtonian time and cannot survive it. Special Relativity had to develop a new 
dynamics. 

In view of the experimental success of the Newtonian theory, relativistic 
dynamics had to agree well with it when low speeds were involved (i.e, in the 
case of our particle x, when u <1 =c). This condition constrained the 
problem, but did not yield the answer. A solution could only be found 
poetically—as Newton had found his—by devising new concepts, and building 
with them a new framework for judging and describing facts. As on other 
similar occasions in the history of thought, the new wine was first served in old 
bottles, the time-honoured words “mass”, “momentum”, “force”, being now 
used to convey unfamiliar ideas. Stunned readers learned that force was 
relative and that the mass of a material object would increase dramatically bya 
mere transformation of coordinates. No wonder Relativity was resisted by 
some pious ideologues, while others hailed it as bringing about the “de- 
materialization of matter”. We cannot dwell on the historical details of the 
genesis of relativistic dynamics; but a short discussion will be necessary to 
motivate our subsequent sketch of the four-dimensional formulation that 
finally threw light on the meaning of the new conceptual system and on its link 
with the old.’ 

In his (1905d), § 6, Einstein assumes that Maxwell’s equations for free space 
hold good in any inertial frame, provided that the coordinate system to which 
they are referred is a Lorentz chart adapted to the frame in question. The 
assumption was justified insofar as Maxwell’s equations had been suggested 
and confirmed by experiments referred to laboratory frames moving almost 
inertially with significantly different velocities relative to the fixed stars. It was 
also consistent with the form of the equations; indeed, the Lorentz transform- 
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ation had first turned up in the quest for a “substitution of variables” that 
would leave the Maxwell equations unchanged in form. In the classical 
formulation adopted by Einstein, Maxwell’s equations for free space equate 
the time rate of change of the electric field strength E with the curl of the 
magnetic field strength H, and vice versa. E and H are time-dependent, 
divergence-free Cartesian vector fields in the relative space of an inertial frame 
F. The time derivative and the curl are taken with respect to the appropriate 
coordinate functions of a Lorentz chart x, adapted to F, and its associated 
Cartesian chart X. We agree that ¢,,, is equal to | or —1 if {a,f,y) is, 
respectively, an even or an odd permutation of {1, 2, 3}, and is equal to 0 if any 
two of the indices are equal. If E, and H, denote the «-th components of Eand 
H relative to x, the equations can be concisely written thus:? 


OE, 0H, oH 
ax? “FY BY Axo “FY Bx 


(4.4.4) 


If y is a Lorentz chart adapted to another frame F’ and related to x by the 
transformation equation (3.4.18), the Maxwell equations 


OE, OHs OH, CER 

ay? = Eapy ay? _ ayo = aby Br (4.4.5) 
will hold good provided that 

EG = E, H', —_ H, 


E', = B(E, —vH;) HH’, = B(H,+0vE;) (4.4.6) 
E’, = B(E,;+0H,) H's = B(H; —vE;) 


The E’, and H’, can be naturally regarded as the components, relative to j, of 
two Cartesian vector fields, E’ and H’, on the relative space of F’.* Nothing 
would mar the beauty of this result if its meaning were clear. It so happens, 
however, that the electric and magnetic field strengths in Maxwell’s equations 
were conceived until 1905 as fields of (Newtonian) force. Thus, the value of Eat 
a point Pin S; was supposed to stand for the force that would be exerted by the 
field on a particle with unit charge at rest at P.* But Eand E’, if related to one 
another by (4.4.6), cannot be the representations, in the relative spaces of two 
different inertial frames, of the same Newtonian force field. We may take this 
as one more indication that the Newtonian concept of force was voided by 
Relativity. But then the electromagnetic field strengths E and H no longer can 
have their original meaning. 

Einstein deals expressly with the interpretation of equations (4.4.6) in the 
second half of §6.° The force on a particle x momentarily at rest in the inertial 
frame F can be balanced and measured by a dynamometer tied to F. The ratio 
of its measured magnitude to the magnitude of the acceleration a that 2 
experiences in F as the force acts on it, is the inertia of z in its own mirf. We call 
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it the particle’s proper mass and denote it by m,. The force on z will be uniquely 
represented in the relative space S; by the vector k of (4.4.2) if t agrees with the 
time coordinate function of a Lorentz chart adapted to F and we set m = mo. 
The electrostatic unit of charge can be defined as usual in terms of this concept 
of force. (Section 2.1, note 4.) The electric field strength Eat a point Pin S;can 
now be understood to be the force in the present sense exerted by the field ona 
particle with unit charge at rest at P. If the vector field E, defined on S;, 
remains constant or changes smoothly in time, equations (4.4.4) determine the 
vector field H, defined on S; for each instant.° The force exerted by the 
electromagnetic field (E, H) ona particle of unit charge moving through Sr is, 
by definition, a vector at the relevant point of S;, with the same components, 
relative to matching Lorentz charts,’ as the force exerted on the particle in its 
mirf by the electric field obtained from (E, H) by the appropriate transform- 
ation, in agreement with (4.4.6).® 

Einstein uses the foregoing concepts to calculate the inertia of a charged 
particle moving in an electromagnetic field. Let g and mg be, respectively, the 
particle’s charge and proper mass. Einstein assumes that q is the same in all 
inertial frames.” We denote the particle by z, its mirf by F’, and suppose that 
the electromagnetic field (E,H) is defined on the relative space S- of the 
inertial frame F, in which x moves with velocity v. One can always find two 
matching Lorentz charts x and y (with associated Cartesian charts x and jy), 
adapted respectively to F and F’. x and y satisfy equations (3.4.18), with 
v =|v|. The sources that generate the field (E,H) in S,, with components 
E,, H,, relative to the chart xX, generate in S,. the field (E’, H’), whose 
components E’, Hj, relative to y, are given by (4.4.6). The acceleration a’ of zin 
its mirf F’ satisfies the equation 


ma’ = qE’ (4.4.7) 


Consequently, by (3.4.18) and (4.4.6), the components a,, relative to x, of z’s 
acceleration in F must satisfy: 


mo Ba, =qE, 
Mo B* a, = gB(E, —vH3) (4.4.8) 
my Ba; = qB(E3; +H) 


The quantities on the right-hand side of (4.4.8) are by definition the com- 
ponents (relative to X) of the force with which the field (E, H) acts on z. The 
ratio of the force components to the acceleration components is plainly greater 
than the proper mass my. It is my 8° for the component parallel to the direction 
of motion, and m, f? for the components perpendicular to that direction. The 
resistance opposed by a moving particle to the accelerative action of an 
electromagnetic force depends therefore on the magnitude of the particle’s 
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velocity and on the angle between the velocity and the force. Einstein points 
out that this result applies even to uncharged particles, for it does not depend 
on the amount of charge.’° It must apply also to every kind of force: as they are 
all liable to be balanced by electromagnetic forces, they must have the same 
effect on matter. 

Einstein calculates the kinetic energy 7 acquired by our particle 2, with 
charge q, accelerated from rest by the field (E, H) until it moves in F with 
velocity v. He assumes that z is accelerated slowly and thus loses no energy by 
radiation. As the force perpendicular to the direction of motion does no work 
on 7, 


r= {ae,ax = my Br vdv = mo(B-1) (4.4.9) 
0 


As v approaches 1, T grows beyond all bounds. Hence the particle cannot 
attain the speed of light. If v < 1, 


x tmv? (4.4.10) 


the classical value of the kinetic energy. 

In his (1905d ), § 8, Einstein shows that if Lp is the energy content ofa train of 
parallel plane light waves referred to an inertial frame F, its energy content 
referred to a frame F’, moving in F with constant velocity v, at right angles with 
the direction of propagation of the wave train, is equal to Lf. This result is 
used by Einstein in the short paper (1905e}—published soon after his 
(190Sd }—for establishing a most remarkable proposition. Einstein assumes 
that the Principle of Conservation of Energy holds good in all inertial frames 
(“according to the Relativity Principle”, he says). Then, if a body at rest in an 
inertial frame F simultaneously emits two trains of plane light waves of energy 
L/2 in opposite directions, it must lose the energy L in F, and the energy Lf in 
the frame F’ that moves in F with velocity v at right angles to the direction of 
emission. In the carefully contrived conditions of this Gedankenexperiment, the 
body will not recoil as it emits the light, but will remain at rest in F, and moving 
with speed v in F’, Its state of motion remains, so to speak, kinematically 
unchanged by the emission of radiation, though surely the loss of energy must 
alter it dynamically. Let +, and t, be times immediately preceding and 
following the emission and let Q, and Q’, denote the energy content of the body 
at time t,in F and F’ respectively (r = 1, 2). If, with Einstein, we understand by 
the kinetic energy of a body (with respect to a given inertial frame) the full 
increment in the body’s energy due to motion (in that frame), it is plain that the 
kinetic energy 7, which our body possesses at time ¢, with respect to the frame 
F’ is equal to (Q', —Q,), up to an arbitrary constant C. Since C will not change 
by the emission of light, it follows that 


T,-T, = L(p-1) (4.4.11) 
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We may of course choose vas small as we wish. If v < 1, the kinetic energy lost 
by the body in F’ as it radiates the energy L in its rest frame is 


AT = 4 Lv? (4.4.12) 


Comparing (4.4.12) with (4.4.10) we see that when a body at rest loses the 
energy L its proper mass will decrease by the same amount (viz. if c = 1; 
otherwise Am, = L/c?).!! As it is not essential that the energy be lost by 
radiation, Einstein draws the general conclusion that “the mass of a body is a 
measure of its energy content”.'? Of course, this conclusion does not follow 
from his premises. Even if all the energy released by a body has to be subtracted 
from its mass, the remainder could be more than the body’s extant energy 
store. However, the conversion of colliding particle—antiparticle pairs into 
light—unsuspected in 1905—lends unexceptionable support to Einstein’s 
dauntless extrapolation. If the energy content of a particle at rest is equal to its 
proper mass my and its kinetic energy when travelling with speed v is equal to 
mo B —mo, the total energy of the particle relative to an arbitrarily chosen 
inertial frame F is 


m(u) = mB, (4.4.13) 


where wu is the speed of the particle in F and B, = 1/ \/1 —u?. For historical 
reasons that will soon become apparent, m(u) is usually called the relativistic 
mass.'3 

Though Einstein’s 1905 definition of force has not been proved incon- 
sistent,'* and though he relied on it on his way to his most famous discovery, it 
was soon supplanted by another, in a way more natural definition, introduced 
by Planck (1906). While Einstein had procured a unique—and hence 
absolute—representation of the force accelerating a particle in each inertial 
frame by equating it formally, componentwise, with the force defined 
physically according to Newton’s Second Law, in the particle’s mirf, the new 
definition furnished a physically warranted representation of the force, 
peculiar to each frame. One simply substitutes m(u), from (4.4.13), for m in 
(4.4.2) and defines the vector k as the relative force on the particle in the inertial 
frame to which its velocity u is referred. 


p= m(u)u (4.4.14) 


is adopted as the relativistic definition of momentum. An argument traceable to 
Lewis and Tolman (1909) and now included in many textbooks'® proves that if 
the Principles of the Conservation of Energy and Momentum hold for an 
isolated system of freely colliding but otherwise unrelated particles—that is, if 
2,m(u;) = const. and X,m(u;)u; = const., where u; is the velocity of the ith 
particle of the system at any appointed time in any given inertial frame—m(u) 
must be the function defined by (4.4.13). 
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One of the most satisfactory features of the concept of relative force is that 
the time rate of increase of the energy of a particle subject to the force, relative 
to the frame to which the latter is referred, equals the power or rate at which the 
force does work on the particle. If k is the relative force and the particle’s 
velocity is u, the power is 


d du du 
u:k = we m(uu = mo By +? mo Bou 
ur du 3. du 
mobal +5 ae mo Buus 
= Ob a (4.4.15) 
~ dt vi 


The result concerning kinetic energy derived in (4.4.9) can now be obtained, by 
integrating (4.4.15), without appealing to Maxwell’s equations. As 


k = — m(u)u = m(u)a + u(u- k) (4.4.16) 


it is clear that the relative force k and the acceleration a = du/dt effected by it 
are linearly dependent if and only if k is orthogonal to or linearly dependent on 
the velocity u. The traditional concept of inertia or ratio of force to 
acceleration has now a definite meaning only in these two cases. Moreover, in 
each case it has a different value. We must distinguish (i) the transverse mass 
my B,, Or resistance opposed by a particle moving with velocity u to a force 
perpendicular to its direction of motion, from (ii) the longitudinal mass m, B?, 
or resistance opposed by a particle moving with velocity u to a force parallel to 
its direction of motion. Only the former is equal to the energy of the particle, 
and is often called its “inertial” or “relativistic” mass. (Case (i) includes of 
course the resistance my offered to any force by a particle at rest: if u = 0, it is 
orthogonal to every vector.) 

The “transverse mass” defined as the ratio of transversal relative force to 
acceleration differs from the transverse inertia calculated by Einstein from his 
1905 concept of force.'© We must therefore give a different reading of 
equations (4.4.8): the components of the relative force exerted by the 
electromagnetic field (E, H), in frame F, on the particle, are gE,, q(E, —vH;) 
and q(E; + vH,). The relative force is thus none other than the Lorentz force 


k = q(E+vx H) (4.4.17) 


The Lorentz force law (4.4.17) holds in each inertial frame as a corollary of 
Einstein’s RP and Maxwell’s equations. 
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4.5 A Glance at Spacetime Physics! 


Spacetime physics began with the observation that some of the Cartesian 
vectors of relativistic mechanics and electrodynamics can be regarded as the 
spacelike parts, relative to the appropriate frames, of 4-vectors whose timelike 
parts are familiar scalars, or can be combined in pairs to make a 6-vector. I 
noted on page 86 that Poincaré had already been aware of this, and 
I mentioned the current density vector and the charge density scalar as a case in 
point. Consider a static, continuous charge distribution referred to an inertial 
frame F. Let p(t, r) be the charge density at point rand time ¢. p is obviously a 
time-dependent scalar field on S; and can therefore be identified with a scalar 
field on .&@ (p. 103). We define a 4-vector field J by giving its components 
(p, 0, 0, 0), relative to a Lorentz chart x, adapted to F. The spacelike part of J 
relative to F is 0 and is therefore equal to the current density in F, while the 
timelike part is of course the charge density p. To calculate the components of J 
relative to a Lorentz chart y, matched with x and adapted to a frame F’, 
moving in F with speed v, we simply multiply (p, 0, 0, 0) on the left by the 
Jacobian matrix (3.4.18’). We obtain: (8p, —Bpv,0,0). By the “length 
contraction” equations (3.5.1), the charge density in F’ equals Bp, while 
(— Bpv, 0, 0) are precisely the components of the current density in F’, relative 
to the Cartesian chart y, associated with y. The 4-vector J is aptly called the 
4-current density. 

An even more striking example is provided by the 4-momentum P of a 
particle x of proper mass my and 4-velocity U. Its components, relative to a 
Lorentz chart x, adapted to a frame F, are defined by 


Pi = mUi (4.5.1) 


Hence, by (4.3.5), P° = moB,, and P* = mo8,,u", where u is the speed of x in F 
and the u* are z’s velocity components relative to the Cartesian chart x, 
associated with x. In other words, the timelike part of P, relative to F, 1s 1’s 
energy m(u), while its spacelike part is 7’s momentum m(u)u (4.4.13-14). The 
Principles of Energy and Momentum Conservation turn out to be no more 
than the phenomenological, frame-dependent expression of the deeper, 
absolute Principle of the Conservation of 4-Momentum. 

The 4-momentum P, which incorporates 2’s energy and momentum in a 
single geometrical object, epitomizes the particle’s dynamical state. Hence, if 
an external force on z is to be measured, as in classical physics, by the rate at 
which it changes the particle’s dynamical state, its components, relative to the 
Lorentz chart x, must be given by 


Ki = dP/dt (4.5.2) 
It is not hard to show that the Kare indeed the components of a 4-vector, the 


4-force K, introduced by Minkowski in 1908.? Relative to the frame F, in which 
our particle moves with speed u, the spacelike part K of K is f, times the 
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relative force k or rate of change of z’s momentum, while its timelike part K° is 
B,, times the power k- wor rate of change of z's energy. Note that if F happens 
to be z’s mirf, K = k and K°® = 0. Thus Minkowski achieves in effect with 
his 4-force what Einstein in 1905 had sought in vain: a physically 
motivated, frame-independent representation of the force on a particle, 
determined by the Newtonian force referred to the particle’s own mirf 
(cf. p. 110). 

Let us now consider a continuous distribution of matter. For each 
worldpoint P there is an inertial frame Fp in which the distribution is 
practically at rest in a small neighbourhood of P. We call Fp the local inertial 
comoving frame (licf) at P. Let 6 V° denote an infinitesimal “element of volume” 
in the licf, while 6 V denotes the volume it takes up in the relative space $, of an 
arbitrary inertial frame F. Then, by (3.5.1), if the local speed of the material in F 
is u, OV° = B OV. The external force on the material is aptly represented, 
relative to F, by a time-dependent Cartesian vector field on S,., the force density 
f, which is defined as follows: the relative force acting at time ¢ on the volume 
element 6V about a point rin S, is equal to f(t, r)6V. Suppose that the material 
in OV at the time in question moves in F with velocity u. The spacelike part K, 
relative to F, of the 4-force K on OV, is then equal to £,f(t, r\oV, while its 
timelike part K° equals Bu: f(t, r)6V. Clearly, then, (K°, K) = (uf, f)dV°. 
Consequently, f and u-f are, respectively, the spacelike and the timelike 
parts, relative to F, of a 4-vector field, the 4-force density f. Thus, we have 
that 


K = fov® (4.5.3) 


Few chapters in science and philosophy can match the beauty of the four- 
dimensional formulation of classical electrodynamics. The key to it was 
Minkowski’s discovery that the electric and magnetic field vectors result from 
the decomposition of a 6-vector relative to an inertial frame. Since we conceive 
6-vectors as skew-symmetric 4-tensors (p. 103), we shall refer to the electro- 
magnetic field tensor. We denote it by F. The electric and magnetic fields E and 
H of classical electrodynamics are none other than its frame-bound facets, 
whose duality conceals its unity, though at the same time they hint at it by their 
curious interplay. Let x be a Lorentz chart adapted to the frame on whose 
space the fields E and H are defined, and let E, and H, be the components of E 
and H relative to the associated Cartesian chart x. The components relative to 
x of the electromagnetic field tensor F in its contravariant form are then given 
by:3 


(Fi) =] (4.5.4) 
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Aneasy calculation shows that the components F''‘/ of F relative to the Lorentz 
chart y, matched with x by equations (3.4.18), are obtained by replacing in the 
above matrix the unprimed values E,, H,, by the primed values E), H%, 
obtained from them using equations (4.4.6). The latter are, of course, the 
components, relative to y, of the electric and magnetic fields E’ and H’, defined 
on the space of the frame to which y is adapted. The Maxwell equations for 
free space, in the presence of convection currents can then be elegantly stated 
as follows: 


ghaki 


OF OF’ : 
aon 0 aa 4nJ (4.5.5) 
Here the F;; are the components, relative to x, of the covariant form of F 
(note 3), the J‘ are the components of the 4-current density, and we set ¢'# 
equal to 1, —1 or 0, depending on whether hjki is an even or an odd 
permutation, or no permutation at all, of {0, 1, 2,3}.° 

Let x be a test particle of charge gq and 4-velocity U, whose worldline 
traverses the electromagnetic field F. We want to calculate the components 
relative to a Lorentz chart of the 4-force K exerted by the field on z. Consider 
first a Lorentz chart adapted to z’s mirf. Relative to it, U° = 1, U7 =0,K° =0, 
and K* =k, = gE, (by the definition of E). Hence the components of the 
4-force on 2, relative to such a chart, satisfy the equations 


Ki = qU*Fi (4.5.6) 


(where the Fi. are the components of the (1, 1) mixed form of F——see note 3). As 
(4.5.6) equates 4-vector components it remains valid if referred to an arbitrary 
Lorentz chart (page 305, note 7). A short calculation shows that, for i = 1, 2 
or 3, (4.5.6) merely restates the Lorentz force law (4.4.17). For i = 0, it gives the 
equation of energy that follows from (4.4.17). 

Consider now a time-dependent continuous charge distribution acted on by 
the field F. As on page 115 we define at each worldpoint P a licf Fp relative to 
which the charge distribution is practically static in a small neighbourhood of 
P. If 5V° denotes the comoving element of volume and p is the licf charge 
density, the amount of charge contained in 6V° is equal to pd6V°, so that, by 
(4.5.3) and (4.5.6), the components of the 4-force density f relative to a Lorentz 
chart adapted to the licf satisfy the equations: 


fi = J*Fi (4.5.7) 


Again, these equations remain valid if the tensorial components on either side 
are referred to an arbitrary Lorentz chart. By a well-known theorem of 
classical electrodynamics, the relative force density f on a charge distri- 
bution—1.e. the spatial part of the 4-force density f relative to a given inertial 
frame—stands in a simple relation to the Poynting vector S = (1/4z)Ex H 
and a symmetric Cartesian tensor s, constructed from the electric and magnetic 
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field strengths E and H. s is known as the Maxwell electromagnetic stress 
tensor, for Maxwell, who first introduced it, conceived it—on the analogy of 
the mechanical stress tensor (page 118}—as representing the stresses of the 
aether, the putative bearer of electromagnetic action.® The components of s, 
relative to the Cartesian chart X, associated with a Lorentz chart x adapted to 
the inertial frame to which E and H are referred, are given by 


Sap = —(1/4n)(E, Eg + Hy Hy — (1/2)(E? + H?)6,,) (4.5.8) 


(Note that s is a symmetric tensor.) The components of the force density f 


relative to X are then 
oS, @ 
f= -( ve “4 ) (4.5.9) 


6x® ax? 


By another well-known theorem, if u is the velocity field of the charge 
distribution in the chosen frame, the equation of energy is 


Co 60S 
f-u= -(5+35) (4.5.9b) 


where o = (E? + H”)/8z is the energy density of the electric and magnetic 
fields. The left-hand sides of equations (4.5.9) are equal to the components, 
relative to x, of the 4-force density f. Substituting 0/dx? for 6/0x* we obtain 
on the right-hand side the components of a 4-vector. Setting S°° = o, 
$°? = $79 = §,, and S* = 5s,,, the relation between this 4-vector and the 4- 
force density f can be stated as follows: 
os! 
tao (4.5.10) 


From (4.5.10), (4.5.7) and the second set of equations (4.5.5), we conclude that 
ost 1 OFM 
= F! 


dxi an ax Gann) 
Equations (4.5.11) can be solved to yield:’ 
a: 1 ss oes 7 
Si = — 7 (FF) —399F yF") (4.5.12) 


The S‘/ are therefore the components, relative to x, of a symmetric 
contravariant 4-tensor field of order 2, which we shall denote by S.° This 
geometrical object known as the electromagnetic energy tensor, splits into three 
time-dependent fields on the relative space of any given inertial frame, namely, 
the energy density o, the Poynting vector S, and the Maxwell stress-tensor s, 
which are thus seen to constitute the frame-dependent breakdown of a single 
physico-mathematical entity. 


RAG - E 
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Following the discovery of the electromagnetic energy tensor,? Max von 
Laue (1911) inferred from the Relativity Principle that “in every domain of 
physics it must be possible to combine the force density and its power” into a 
4-vector field. To each such 4-force density f there are infinitely many 4-tensor 
fields S that stand to it in the relation expressed by (4.5.10) (see note 7). Laue 
assumed that “in each domain of physics there is one among such 4-tensors 
whose components have a physical meaning corresponding to that of the 
components of the said electromagnetic tensor”.!° Laue postulated moreover 
that this physically significant solution of (4.5.10) would be, in each case, 
a symmetric 4-tensor field. As the diverse “domains of physics” interact in a 
given spacetime region, their respective tensor fields can be added to define a 
single symmetric tensor field, representing the dynamical state of the whole 
physical system that fills that region. As an illustration of these ideas, let us 
define the energy tensor of a material continuum, constructed by Laue from 
the energy and momentum densities and the mechanical stress tensor.!' Let 2 
denote a continuous distribution of mechanically interacting matter. In the 
relative space S; of an inertial frame F the presence of & is indicated by a time- 
dependent scalar field, the energy density yz, while X’s motion is described by a 
time-dependent Cartesian vector field, the velocity field u. Each point of © is 
subject to stresses caused by the surrounding material. They are represented by 
a time-dependent Cartesian tensor field on S;, the mechanical stress tensor t. 
Let x be a Lorentz chart adapted to F and let X be the associated Cartesian 
chart on S. The nine components f,, of t, relative to x, are defined as follows: 
Let Qf be a small square centred at a point r of S; and normal to the B-th 
parametric line of x through r; then, the value of f,, at r,at time ¢, is equal to the 
a-th component, relative to xX, of the force per unit area exerted on Q?, at that 
time, by the material pressing it from the side on which the coordinate function 
x* is smaller. If F happens to be the licf of our matter distribution at 
worldpoint (71), tag (t, ©) = tg,(t,r), and the local value of t is a symmetric 
tensor.'* We assume that £ is subject to no external forces. The force density at 
each point is then 0, and the energy of the system is conserved. The equations 
of motion are given by!? 


Chu, 6 
Axo + ap lap + pUu,ug) = 0 (4.5.13a) 


while the conservation of energy is expressed by the equation of continuity: 


OH Opy 
Ox® © Axe 


0 (4.5.13) 


The striking similarity of (4.5.13) with (4.5.9) suggests that we substitute x* for 
x* and set M°° = p,M° = M”° = wu,, and M” = (t,g + wu,ug). (4.5.13) can 
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be read after (4.5.10), as a statement about the 4-force density on 2: 


aM 
7 = 9 (4.5.14) 


The M‘/ are the components of a contravariant 4-tensor field of order 2, the 
energy tensor of the material continuum Z, which we denote by M.'* The 
physical significance of equation (4.5.14) can be brought out by rewriting it as 
@M'°/éx° = —6M‘*/éx*, and integrating both sides over an arbitrary volume 
V bounded by a surface S in the relative space of frame F. In the light of our 
definitions it is clear that M‘’, M'? and M® are the components, relative to X, 
ofa Cartesian vector field we shall denote by M‘. The value of M° at each point 
of space equals the local energy flux, while that of M7 is the local flux of 
momentum in the direction of increasing x*. (Remember that force equals 
momentum transfer per unit time.) We have, then, that 


d ‘ aM’ Mi : 
“| M®dv = aves | OM gr a-a'| vote 
dx® J, y Ox® i OM - 


(4.5.15a) 


By Gauss’s Theorem, 
-| V-MidV = - | Minds (4.5.15b) 
Vv Ss 


(where n is an outward directed unit vector normal to the element of surface 
dS). Equating the left-hand side of (4.5.15a) with the right-hand side of 
(4.5.15b) we see that the time rate of increase of each component of 4- 
momentum in V equals its inward flux across the boundary S. Equation 
(4.5.14) thus expresses the conservation of 4-momentum for a continuous 
distribution of mechanically interacting matter. The M‘/ are the components 
of a contravariant 4-tensor field of order 2, the energy tensor of the material 
continuum 2%, which we denote by M.'* Following Laue, we prove that M is 
symmetric by taking the components of its value at any worldpoint, relative to 
a chart adapted to the licf. Evidently M°* = M”° = 0 (for the velocity field is 
locally 0 in the comoving frame), while M*? = ty, = ts, = M** (for t, as we 
noted above, is symmetric in the licf). The symmetry of M follows (p. 305, 
note 7).!° 

Asa further illustration of these ideas let us assume that the continuum Z isa 
perfect fluid, or in other words that at each worldpoint the pressure is isotropic 
(i.e. the same in all directions) in the licf. Relative to a Lorentz chart adapted to 
the licf Fp at a given worldpoint P, the components of the energy tensor M 
have the local values 


M°=p M*=M®=0 M*=pé,, (4.5.16) 
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where p and p are the local (proper) energy density and the local pressure, 
respectively. To calculate the components M‘/ of M relative to the inertial frame 
F in which the motion of the fluid £ is represented by the velocity field u we use 
the fact that, due to the spatial isotropy of the M*, we may assume, without 
loss of generality, that F is related to Fp by the pk Lorentz transformation 
(3.4.19}with the local value of —u substituted for the velocity v—plus, if 
need be, a translation which would make no difference to our present purpose. 
Denoting the typical element of the Jacobian matrix of the transformation by 
Ai (see (3.4.19), we have that M'/ = Ai,AjM™. Consequently: 


M° = (p+pu’)B2 0 M*°=(p+p)u7B2 MM? = pdag + (p+ p)uruP Be 
(4.5.17) 


where By = 1/,/1 —u’. If we represent the motion of the fluid by the 4-velocity 
field U, with timelike and spacelike parts, relative to F, given by 


U°=8, U=uf, (4.5.18) 


we can translate (4.5.17) into the following concise expression for the 
components of the energy tensor of a perfect fluid: 


Mi pe (p + pyU'Us —pnis (4.5.17*) 


Suppose now that X is a continuous distribution of charged matter 
interacting with the electromagnetic field F. The 4-force density on Z no longer 
vanishes, but is given, relative to the Lorentz chart x, by (4.5.10). Consequently, 


ies (Mv +S) =0 (4.5.19) 
Ox? 

But M+ S‘/ is precisely the (i, ;)-th component, relative to x, of the sum of the 
tensor fields M and S. If 2 is subject to other external forces, the symmetric 
tensor fields corresponding to each of them according to Laue’s assumption 
must also be added to M, in order to keep the right-hand side of (4.5.19) equal 
to 0. Let T=M+S+ ...denote the sum of all such tensor fields. We call T 
the total energy tensor. Its components, relative to a Lorentz chart x, satisfy the 
equations 

6TY 
éx! 


=0 (4.5.20) 


This law is to the physics of continua what 4-momentum conservation is to 
particle dynamics. It will play a cardinal role in connection with Einsteins field 
equations of gravitation. The latter, in turn, vindicate the postulated symmetry 
of the total energy tensor and its summands.'® 

Had we not chosen our units so that c = 1, diverse powers of c would turn 
up as factors of the timelike or the spacelike parts of our 4-vectors, and of some 
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or all the components of our 4-tensors of higher order. What components 
must be multiplied by what powers of c will depend on the definition of the 
geometrical objects in question, and some freedom may be exercised in this 
respect. Such definitions are usually chosen on practical or aesthetical grounds. 
Our choice can also be fruitfully guided by the defined object’s behaviour at 
the Newtonian limit, c > oo. W. G. Dixon (1978) consistently takes the latter 
approach, arriving at some very interesting comparisons. 


4.6 The Causal Structure of Minkowski Spacetime 


In a Newtonian world, a massive particle can attain any speed relative to a 
given inertial frame. Moreover, a signal can, in principle at least, be transmitted 
instantaneously from one spatial point to another, e.g. by poking at the 
addressee with a rigid rod. Neither possibility is available in the world of 
Special Relativity, where no signal can travel faster than light in vacuo, and 
massive particles cannot even go as fast as that. We saw in Section 3.7 that this 
feature of Special Relativity is essentially linked to the Lorentz group. We may 
therefore expect that it has an important bearing on the geometry of 
Minkowski spacetime. To see just how important this is we need some 
definitions. 

We say that a worldpoint P causally precedes a worldpoint Q (symbolized: 
P< Q)if P and Q are viable locations for, respectively, the emission and the 
reception of a signal. If the signal in question can be relayed by massive 
particles, we say that P chronologically precedes Q(P <« Q). Both kinds of 
precedence are obviously transitive relations. To simplify some statements 
we stipulate that causal precedence is also reflexive (for every worldpoint 
P, P < P). If P precedes Q causally, but not chronologically, we say that P has 
the horismos relation with Q(P > Q). P and Q are causally connectible if either 
P<@Q or Q < P. “Chronologically connectible” and “horismotically con- 
nectible” are defined analogously. If P and Q are not causally connectible we 
say that they are separate worldpoints (symbolized: P||Q). P’s causal future, 
J*(P), is the set of ail worldpoints that P causally precedes. Its causal past, 
J” (P), is the set of all worldpoints that causally precede P. P’s chronological 
future and past, 1*(P)and I~ (P), and its future and past horismoi, E* (P)and 
E~(P), are defined analogously. Note that, by definition, P lies in the 
intersection of its causal past and future; also, its future (past) horismos is the 
complement, relative to its causal future (past), of its chronological future 
(past). The causal future J *(U) of a set of worldpoints U is the union of the 
causal futures of the elements of U. The sets J~(U), I * (U), etc. are defined 
analogously. We denote the union of J*(P) and J~(P) by J(P). I(P), E(P), 
J(U), etc. denotes the unions of the appropriate pasts and futures. The 
worldpoints separate from P we denote by | (P). 
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These concepts apply to Minkowski and Newtonian spacetimes and, more 
generally, to any point set that may be supposed to furnish locations for 
physical events. Their properties and relations depend of course on the 
prevalent physics and can be used for giving a broad characterization of the 
latter. Thus, for instance, if Pisa point in Minkowski spacetime, the set E( P) of 
all points horismotically connectible with P, is the null cone at P (p. 92). E(P) 
is the common boundary of two disjoint open sets, namely, /(P) and || (P). 
While || ( P) is connected, [( P) consists of two components, I * (P) and I~ (P), 
whose boundaries are, respectively, E* (P)and E~ (P). Since P is the only point 
in the intersection of the latter, it is the only point on the boundary of both 
I*(P) and I~(P). P lies also on the boundary of ||(P), so that every 
neighbourhood of P contains a point separate from it. The partition of 
Minkowski spacetime into the three open sets [ * (P), I~ (P) and || (P) and the 
closed set E( P) depends uniquely on P. One the other hand, if P is a point in 
Newtonian spacetime, | (P)is empty and E( P) is the common boundary of the 
disjoint open sets I * (P) and I~ (P). Any point in E( P) is simultaneous with P 
and determines the same partition of Newtonian spacetime into the simul- 
taneity class E(P), its chronological future I* (E(P)) = I* (P) and its chrono- 
logical past I~ (E(P)). In other words, while each Minkowskian worldpoint 
has a chronological—and a causal—past and future all to itself, and is separate 
from—and yet near to—an open, connected region of spacetime, a Newtonian 
worldpoint is not separate from any other such point and shares its past and 
future with its contemporaries. This deep and far-reaching ontological 
difference between Newton’s and Einstein’s world follows immediately from 
the properties of the speed of signals and massive particles that we mentioned 
at the beginning of this Section. By definition, E(P) is the locus of all 
worldpoints causally, but not chronologically, connectible with P. In 
Minkowski spacetime E( P) contains, therefore, every location available for the 
emission or reception of direct light signals travelling in vacuo to or from P. 
Since, apart from P itself, no possible emission site can be a reception 
site, or vice versa, E(P} naturally falls into two 
parts, E*(P) and E~(P), the two sheets of a null cone, whose only 
common element is the vertex P. In Newtonian spacetime, E(P) is the locus 
of possible emissions and receptions of instantaneous signals to and from P 
and is therefore equal to the simultaneity class of P. Obviously, 
E(P) = E*(P) = E” (P). Inthe Newtonian world, in which distant interaction 
demands no time and the speed of massive particles does not have an upper 
bound, the past and future horismoi of a worldpoint P, i.e. the boundaries of, 
respectively, its past and future, collapse into a single 3-space, the locus of 
worldpoints simultaneous with P. We see at once that, while causal precedence 
defines a partial ordering of Minkowski, but not of Newtonian spacetime, 
chronological precedence defines a total ordering of the simultaneity classes of 
the latter. 
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A. A. Robb (1914) derived the entire geometry of Minkowski spacetime 
from an axiom system with only two primitives (undefined predicates), namely, 
“x is an element” and “the element x is after the element y”. Their intended 
meaning is, respectively, that x is a worldpoint and that y is different from x 
and causally precedes x.' H. Mehlberg (1937) proposed a different axiom 
system, in which the weaker, symmetrical relation of causal connectibility 
replaces causal precedence as a primitive.” Mehlberg’s philosophical remarks 
tend to foster the claim—of which, however, I do not find any trace in Robb— 
that such axiom systems effect a reduction of spacetime geometry to causality. 
To form a proper judgment of the validity, or, at any rate, of the meaning of 
this claim, one must bear in mind that, in order to yield the theorems of 
Minkowski geometry, Robb’s and Mehlberg’s axioms must impose on the 
primitives much stronger constraints than our ordinary concepts of causality 
have ever been supposed to entail. The two-place relations we have dubbed 
“causal precedence” and “causal connectibility” acquire thereby a wealth of 
connotations that their ordinary language homonyms do not possess. Since, 
moreover, the relations in question do not, in the intended interpretation, hold 
between actual causes and effects, but between the spacetime locations where 
such may occur, it is plausible to say that Robb’s and Mehlberg’s systems give, 
not a reduction of geometric to causal relations, but rather a set of geometric 
conditions with which the basic forms of physical causality must comply, if the 
principles of Special Relativity hold good. One may then claim indeed that, in 
Special Relativity, spacetime is nothing but—and no less than—“the ordering 
schema governing causal chains”,> which, so to speak, lays out the permissible 
avenues of physical action. But a geometric structure which thus regulates 
causality is not properly described as reducible to it. 

Following E. H. Kronheimer and R. Penrose (1967) we regard Minkowski 
spacetime, thus conceived, as a paradigm and a special case of a more general 
mathematical structure, called a causal space. Let S be any set, and let < and < 
be two binary relations on S. We define a third binary relation — by the 
condition: for P,QeS, P—Q if and only if P< Q and not P <Q. We 
designate <, <,and — by the same names as before and define the past and 
future sets J~(P), J*(P), 17 (P),...,J7 (U), etc. for each PES, U cS, as 
above. We say that (S, <, <) is a causal space if, for every P, Q, RES: 


(i) P < P. (Causal precedence is reflexive.) 
(ii) If P< Qand Q < R, then P < R. (Causal precedence is transitive.) 
(ii) If P <Q, then P< Q. 
(iv) P < P only if, for some X # P, P < X and X <P. 
(v) If P< Qand Q <R, then P <R. 
(vi) If P<Qand O<R, then P <R. 
(vi) If P<@Q and Q <P, then P=Q. (Causal precedence is anti- 
symmetric.}* 
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Let S be the underlying set of a causal space. If Sis endowed with a topology 
it makes sense to speak of causal, chronological or horismotical curves in S. 
A continuous mapping y:/ > S (where J is a real interval) is a future-directed 
causal curve in S if y(t) < y(t') whenever t < t’ (t, t'e€ I); y 1s a past-directed 
causal curve if y(t) < y(t’) whenever t’ < t. A curve y is said to be closed if it 
returns twice to the same point in S (i.e. if there are in J three real numbers 
t<t’ <t",such that p(t) = y(t”) # y(t’). Evidently, in a causal space there can 
be no closed causal curves. Chronological and horismotical curves of these 
several kinds are similarly defined. If S has a differentiable structure the curves 
under consideration may or may not be smooth. In sucha case we shall usually 
assume that the following condition of causal smoothness is fulfilled: any 
smooth chronological, horismotical, causal or acausal curve in S can be 
smoothly extended to a curve of the same kind beyond any given point in its 
range. A vector ata point Pe€S will be said to be causal if it is, acausal if it is not 
tangent to a smooth causal curve in S. A causal vector is horismotical if it is 
tangent to a horismotical curve, and chronological otherwise. Horismotical 
and chronological vectors are past or future-directed, depending on the nature 
of the causal curves to which they are tangent. Note that our definitions are so 
contrived that the zero vector is a horismotical vector of both types. The 
condition of causal smoothness implies that a vector v at P is future-directed 
causal, chronological or horismotical if and only if — visa past-directed vector 
of the same kind. Consider an arbitrary set A < S. If no two points of A are 
Joined by a chronological curve, A is said to be achronal. If no two points of A 
are joined by a causal curve, A is said to be acausal. A point P é Sis said to lie in 
the future domain of dependence of A (symbolically: P € D* (A)) if every past- 
directed maximally extended causal curve through P meets A. Similarly, P 
belongs to the past domain of dependence of A(PéD (A)) if every future- 
directed maximally extended causal curve through P meets A. The domain of 
dependence D(A) of A is the union of its past and its future domains of 
dependence. If A is acausal and D(A) = S, A is said to be a (global) Cauchy 
surface in S. 

Let (S, <, <) be a causal space. If P. Q, RES, and P<Q<«R, the 
intersection of J*(P) and I (R) is said to be a chronological neighbourhood of 
Q. The collection of all chronological neighbourhoods of every point PES 
constitutes a base of a topology, the Alexandrov topology of S. This is in a sense 
the natural causal topology of S.° Suppose now that S has a topology of its 
own, for instance, because it was first conceived as a differentiable manifold 
(e.g. Minkowski’s world) and subsequently endowed with a causal structure. If 
the Alexandrov topology defined by the latter is weaker than (i.e. is included in) 
the original topology of S we shall say that S is a topological causal space. Such 
is the case if and only if the chronological past and futures of every point of S 
are open in the original topology. The original topology 1s, moreover, identical 
with the Alexandrov topology if and only if the latter is Hausdorff (i.e. if and 
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only if any two distinct points of S lie in mutually disjoint chronological 
neighbourhoods). For this to be true it is necessary and sufficient that every 
neighbourhood of each P eS shall include an open neighbourhood of P into 
which no causal curve enters more than once.° A topological causal space that 
meets this condition is said to be strongly causal, or simply strong. 

If the relations of causal and chronological precedence are given the 
meanings explained on page 121, Minkowski spacetime is indeed a strong 
topological causal space, satisfying the condition of causal smoothness. On the 
other hand, Newtonian spacetime, does not meet the condition (vii}— for any 
neighbourhood of a given worldpoint contains a closed causal curve through 
that point—and therefore is not a causal space. The three precedence relations 
in Minkowski spacetime satisfy, besides (i}-(vii), some stringent additional 
conditions which specify its peculiar causal structure. Robb showed that the 
same axioms required for defining the unique causal structure of Minkowski 
spacetime suffice also for characterizing its affine and metric structures (the 
latter, up to a dilatation). Unfortunately, Robb’s elegant construction cannot 
be summarized in a few pages. In his study of “The Causal Theory of Space- 
time”, J. A. Winnie (1977) gives fairly simple definitions of Minkowskian 
straightness and congruence in terms of causal connectibility, but he does not 
spell out the conditions which his definiens must meet in order that the 
definitions may be satisfactory or even consistent. In the following sketch— 
which owes much to Winnie—I shall try to be explicit about the assumptions 
involved, even if, for brevity’s sake, I do not analyze them. Let there be given a 
causal space. I assume, in the first place, that its underlying set of worldpoints is 
a 4-manifold on its own right, diffeomorphic to R*, or, in other words, that it is 
Minkowski’s world .4.’ A necessary and sufficient condition for the given 
structure (.4, <, < > toagree exactly with the causal structure of Minkowski 
spacetime is the existence of a global chart x of .@ that satisfies the following 
requirement: for any two points P, Q€.#, such that x°(P) < x°(Q), the real 
number n; ;(x'( P) — x'(Q))(x/( P) — x/(Q)) is, respectively, negative, positive or 
zero if and only if P||Q, P <Q or P—> Q.° Let .@> be the tangent space at a 
point Pe. If the given causal structure is that of Minkowski spacetime, the 
future and past-directed horismotical vectors in .@p lie, respectively, on the 
two sheets ofa hypercone with the zero vector at its vertex, that is the common 
boundary of the two sets of acausal and chronological vectors in .@p. These 
two sets are non-empty and open, the former being connected, while the latter 
has two components, which comprise the future-directed and the past-directed 
chronological vectors, respectively. Consequently, any pair of linearly in- 
dependent causal vectors in .@p spans a subspace containing acausal vectors. 
On the other hand, a set of up to three linearly independent acausal vectors 
may span a subspace in which the only causal vector is 0. We say that a vector u 
is conjugate toa vector ve.M@pif u + vand u — vare horismotical vectors. By the 
condition of causal smoothness, if u —v is horismotical, —(u —v) = v —u is 
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horismotical too. Hence conjugacy is a symmetric relation. The set of all 
vectors conjugate to a given vector v is its conjugate class C(v). The reader 
ought to satisfy himself that C (0) is the hypercone of horismotical vectors; that 
if v is horismotical, C(v) is the subspace spanned by v; that if v is acausal, C(v) 
consists exclusively of chronological vectors, and that if v is chronological, C(v) 
is a set of acausal vectors which, together with v, span .@p. We shall say that 
two vectors u,vé .p are congruent if and only if (i) there is a vector we.“p 
which is conjugate to both u and », or (ii) there is a vector we.Mp which is 
congruent, by (i), to both u and v.° The concepts of conjugacy and congruence 
are all we need for defining, up to a choice of unit, the Minkowski inner 
product <, >in.@p. We pick out a chronological vector te. Mp, set (t,t > = 1, 
and <u,u> = —1 forany ueC(t). Since t and C(t) together span .@p, and, by 
bilinearity, for any KER, u, ve. p, (ku, ku> = k? Cu, u> and 2<u,v> = <u 
+v,u+v>—Cu,u>—<v,v >, our stipulation defines the inner product on all 
Mp. We see at once that the partition of .# into chronological, horismotical 
and acausal vectors, due to the causal structure, exactly matches its partition 
into timelike, null and spacelike vectors, by means of the inner product. 
Moreover, any otherwise defined Minkowski inner product on .#p that yields 
the same matching partition will differ from the above only by a constant 
factor, arising from the initial choice of a different unit vector. Since the 
proposed definition depends only on the topological properties of the 
partition of .@ into horismotical, chronological and acausal vectors, the 
Minkowski inner product can be introduced in the stated manner in the 
tangent space at any Pe.@, provided only that the given causal structure 
<M, <, <> sufficiently ressembles that of Minkowski spacetime on a 
neighbourhood of P (“sufficiently” meaning “enough to ensure that a 
partition with the said topological properties obtains in .#p”). But the inner 
products at each point cannot be welded into a single Minkowski metric field 
on .@ unless the given causal structure meets definite global conditions. Since 
we have chosen to avoid Robb’s painstaking attention to detail I shall put 
those conditions in a nutshell, thus: The causal structure must be so laid out 
that, for any Pe.@, the additive group of the tangent space .Mp can act 
transitively and effectively as a group of diffeomorphisms on .@ in such a way 
that the horismotical curves of the causal structure are precisely the straights 
generated under the action by the horismotical vectors (see p. 91). If a 
Minkowski inner product < , >» has been defined on .@>, for some Pe.%, in 
the manner indicated above, and H is an action of #,on.a which meets the 
foregoing requirement, H will propagate the inner product on .@, over all 4. 
Suppose that Q = H(u, P) for some ue.4,. Let H, denote the linear mapping 
of Mo onto ., by the differential of the diffeomorphism X +> H(—u, X). A 
Minkowski inner product <, >g is defined on M 0 by the condition that, for any 
v, WEMo, Cv, W 9 = (H,v, H,w)p. The mapping P++¢,>p, is then a 
Minkowski metric on .4. The horismotical curves of the causal structure are 
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the null geodescis of the chosen metric. Let a and b be two such curves, 
generated by the same null vector ué.@p. Suppose that there is a point Q ona 
and a null vector vé.@p such that H(v, Q) lies on b. The 2-flat through Q 
generated, under H, by u and v is then what Robb called an “acceleration 
plane”. It can be shown that all the straights generated under the action H are 
the intersections of acceleration planes. In particular, three points A, B,C €.4@ 
such that A < B < Clie ona straight generated by a timelike vector if and only 
if there are two distinct acceleration planes containing A, B and C. The same 
condition is necessary and sufficient for A, B and C to lie on a straight 
generated by a spacelike vector, if A||B||C||A. This provides a causal 
characterization of the spacelike and timelike straights of Minkowski 
spacetime.!° 

Perhaps the most striking expression of the link binding the causal and the 
affine metric structure of Minkowski spacetime is Zeeman’s Theorem: The 
group of causal automorphisms of Minkowski spacetime is the group 
generated by the orthochronous Lorentz group and the dilatations, that is, the 
group of orthochronous similarities of the Minkowski geometry. A causal 
automorphism of a causal space is a permutation g of its underlying set, such 
that, for any points P and Q, g(P) < g(Q) ifand only if P < Q. Chronological 
and horismotical automorphisms are similarly defined (substitute < or > for 
<). Clearly the causal automorphisms constitute a group. Let «.#, R*, H, f, > 
stand, as on page 92, for the Minkowski affine metric structure, with .@ an 
arbitrary set, H a transitive and effective action of R* on .@ and f, the 
quadratic form on R* defined on page 89. Two segments PP’ and QQ’ are 
equivalent in this structure if, for some ue R*, P = H(u, P’)and Q = H(u, Q’). 
Two straights a and bare parallel if a segment of ais equivalent to a segment of 
b (and hence also if aand b are generated by the same vector). The Minkowski 
causal space «4, <, <> is defined by the stipulation that, for any P, Qe. 
such that Q = H(u, P) for some ue R*, P < Q if f,(u) > 0 < uy and P>Q if 
f,(u) = 0 S ug. Let G be the group of causal automorphisms of this space. It is 
not hard to see that every orthochronous similarity of the affine metric 
structure belongs to G. To prove Zeeman’s Theorem we must show moreover 
that any géG is an orthochronous similarity. A. D. Alexandrov and V. V. 
Ovchinnikova proved in 1953 that a causal automorphism g €G which is also 
an affine transformation of Minkowski spacetime is indeed equal to a Lorentz 
transformation (modulo a dilatation).'! The main effort of Zeeman (1964) was 
directed at proving that every g €G is such an affine transformation. His proof 
rests on five lemmas. By Lemma 1, a permutation of .# is a chronological 
automorphism of the Minkowski causal space if and only if it is a horismotical 
automorphism, and hence, if and only if it belongs to G.'? The remaining four 
lemmas say that, if géG, it maps null straights onto null straights (2), 
preserving parallelism between null straights (3) and equivalence between null 
segments (5), and that the restriction of g to a null straight is an affine mapping 
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(4). Turning now to the proof itself, let us recall that the affine space 
<M, R*, H > is made into a vector space by picking out a point O¢.# and 
stipulating that the mapping of R* onto .@ by u++H(u,O) is a linear 
isomorphism. Let <.#, O> denote this vector space and write uO for H(u, O). 
In ¢.4@, O) scalar multiplication and vector addition are defined foranyke R 
and any worldpoints P = uO and Q = vO by the equations kP = kuO and 
P+Q=(u+v)O. A permutation f:. # > is an affine transformation of 
<M, R*, H ifand only if it is a linear mapping of <.4, O > onto <.4, f(O)>. 
Let g €G. Choose four worldpoints P,, P,, P;,and P,, such that P; = u,O for 
four linearly independent null vectors u,(i = 1, 2, 3, 4). Any X €.@ is equal, in 
<«M,O), to a linear combination £,x;P;. The mapping h: X + 2,x,g(P;) is 
clearly a linear mapping of <.#,O) onto <.@, h(O)>, and hence an affine 
transformation of <.#,R*,H >. We shall show that g = h. Let M, be the 
subspace of <.@,O}> spanned by the worldpoints P(l <j<k <4). If 
XéEM,, X =x,P, for some x, ER. By our choice of P,, M, isa null straight 
through O, and by Lemma 4, g(X) = g(x, P;) = x,g(P,) = h(X). Hence, in 
particular, g(O) = h(O) and <¢.4,h(O)> = <.4,g(O)>. Let XEM, and 
assume that g =h on M,_,. Then, X = Y+x,P, for some Ye M,-_,. The 
segment Y X is equivalent to the segment joining O to x, P,. By our choice of P, 
the latter segment is null, so that the former is null too, and they both lie on 
parallel null straights. Hence, by Lemma 5, g(Y)—g(X) = g(O)— g(x, P,). 
Recalling that h is linear, that by our inductive assumption g(Y) = h(Y), and 
that by Lemma 4 g is linear on the subspace of <.4, O» spanned by P,, we 
infer that, in <4,g(0)>, g(X)= 9(V)+9(%,P,) = 91) + x9 (Pi) 
= h(Y)+h(x,P,) = h(Y + x,P,) = h(X). (No summation over repeated indices!) 
Hence, g = honall .@ and is therefore an affine transformation of Minkowski 
spacetime.'? 

The proof of Zeeman’s Theorem does not depend on any topological 
assumptions. The Theorem does imply indeed that the causal automorphisms 
of Minkowski spacetime are homeomorphisms of the standard manifold 
topology of Minkowski’s world .# (the topology of R*). However, though this 
topology is identical with the Alexandrov topology defined by the causal 
structure, Zeeman contends that it is “wrong” to attribute it to Minkowski 
spacetime, because it does not mark any distinction between spacelike and 
timelike curves and admits many homeomorphisms of no physical signifi- 
cance.!* Zeeman (1967) proposes a different topology for Minkowski 
spacetime, which we may call the Zeeman topology. It is defined as the 
strongest topology that induces the 1-dimensional Euclidean topology (the 
topology of R) on each timelike straight and the 3-dimensional Euclidean 
topology (the topology of R*) on each 3-flat generated by spacelike vectors.'* 
The manifold topology of .@ also has these properties and is therefore 
weaker—in fact, strictly weaker—than the Zeeman topology. Zeeman proves 
that the group of homeomorphisms of his topology is precisely the full group 
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of similarities of the Minkowski geometry (i.e. the group generated by the 
causal automorphisms and time reversal). The Zeeman topology has some 
curious properties. It induces the discrete topology on each null straight (i.e. 
the topology in which each singleton is an open set). Moreover, the range of 
any chronological curve, continuous in the Zeeman topology, consists of a 
series of straight timelike segments, like the worldline of a particle that moves 
freely between collisions. Hence, the worldline of a particle accelerated by a 
field of force cannot be the range of a continuous curve in a spacetime endowed 
with the Zeeman topology —a limitation that greatly impairs the latter’s 
physical significance.'® S. W. Hawking, A. R. King and P. J. McCarthy (1976) 
criticize the Zeeman topology on this and other grounds and propose another 
topology for causal spaces endowed with any standard 4-manifold topology of 
their own. (This includes all causal spaces of interest for General Relativity.) 
They call it the path topology and define it as the strongest topology that agrees 
on every continuous chronological curve with the topology induced in it by the 
standard 4-manifold topology of the space.!’? D. Malament (1977c) has shown 
that any permutation of such a space which preserves the path topology is also 
a causal automorphism, so that the causal structure is, so to speak, “encoded” 
in the path topology. One may object to such proposals that the new-fangled 
topologies put forward actually presuppose the standard manifold topology 
they are meant to supplant. Thus, the Zeeman topology is defined in terms of 
flats of the Minkowski affine metric structure, and the latter, as we know, 
generates linear isomorphisms of R* onto the set of worldpoints, which 
naturally induce on it the R* topology (page 92). Philosophical realists may 
argue that the concepts involved in such a definition are what enable us to 
grasp the defined notion from a position of ignorance, but need not belong to 
the essential nature of the real entity which the said notion purports to 
represent. However, it seems to me that a distinction and opposition between 
the order and connection of ideas and the order and connection of things is 
particularly out of place when we are dealing with mathematical structures. If 
we postulate the existence of an object of this kind we commit ourselves eo ipso 
to accepting the reality of whatever is required to make it intelligible. 


CHAPTER 5 


Einstein’s Quest for a Theory of Gravitation 


5.1 Gravitation and Relativity 


At the turn of the century Newton’s theory of gravitation was generally 
regarded as the most successful theory of physics. There were indeed a few 
unsolved anomalies, i.e. discrepancies between some observed quantities and 
their predicted values, the best known being the secular acceleration of the 
mean motion of the Moon and the secular advance of Mercury’s perihelion.' 
Slight modifications of Newton’s Law (1.7.1) had been suggested every now 
and then to explain the anomalies away, but the majority of physicists were 
confident that they would eventually be solved, without tampering with the 
Law’s marvelous simplicity, by taking into account circumstances previously 
ignored or as yet unknown. Einstein expressed this general feeling when he 
said, at the 1913 Congress of Natural Scientists in Vienna, that 


[Newton's] laws have revealed themselves to be so exactly right that, from the standpoint of 
experience, there is no decisive ground for doubting their strict validity.” 


A stronger motive for revising Newton’s theory was the uneasiness felt by the 
less accommodating scientists and philosophers on account of its inherent 
notion of instantaneous action at a distance. A change in the position of any 
body relative to others immediately changes, according to Newton’s Law, the 
gravitational pull experienced by the latter. The gravitational potential (p. 33) 
adjusts itself everywhere without delay to the slightest alteration in the density 
of matter at any point. The unlikelihood of this manner of action, so utterly at 
variance with our ordinary experience, was further enhanced when Hertz 
verified Maxwell’s claim that the seemingly instantaneous effects of elec- 
tromagnetic action propagated with finite speed. As Einstein told the 1913 
Congress: 


After the untenability of the theory of action at a distance had thus been proved in the 
domain of electrodynamics, confidence in the correctness of Newton's action-at-a-distance 
theory of gravitation was shaken. One had to believe that Newton’s Law of Gravity could 
not embrace the phenomena of gravitation in their entirety, any more than Coulomb’s Law 
of electrostatics and magnetostatics embraced the totality of electromagnetic processes.* 


A desire to develop “a theory of gravity resembling the electromagnetic 
theories as much as possible” and to show “that gravity can be attributed to 
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actions that do not propagate faster than light”, was the driving force behind 
Lorentz’s “Considerations on Gravity” (1900).* He assumed that all ponder- 
able matter consists of positively and negatively charged particles (electrons), 
and that a density of positive charge p moving with local velocity v generates 
the electromagnetic field (d,h), according to equations (2.2.1), on page 43, 
whereas a density of negative charge p’ moving with local velocity v’ generates 
the electromagnetic field (d’,h’), according to the equations obtained by 
priming the relevant variables in (2.2.1). He further postulated that (d, h) exerts 
the Lorentz force k, = ae(4nc?d + u x h) ona particle of positive charge e and 
velocity u, and the Lorentz force k, = fe'(4nc7d +’ xh) on a particle of 
negative charge e’ and velocity u’ (where « and £ are two constants such that a is 
slightly smaller than f), while (d’, h’) exerts the forces k, = Be(4nc7d’ + u x h’) 
on the former and k, = ae’ (4nc?d’ + w’ xh’). Ina first approximation we may 
regard ordinary, electrically neutral bodies, as consisting of coupled and 
mutually interpenetrating positive and negative electrons, so that p’ = p, 
everywhere. If we assume moreover that the local velocity of positively and 
negatively charged matter is the same it follows that d’ = —d and h’ = —h. 
Two test particles with positive charge e and negative charge e’ = —e, 
moving with the same velocity u, will then experience, respectively, the forces 
k, +k, = k,(1—8/a) and k,+k, = k,(1 —B/a). Since the force on each 
particle is the same, there is, under the circumstances, no electric field, but there 
is a resultant force on the two-particle system equal to 2k, (1 — B/a). Lorentz 
expected that this force would suffice to explain the phenomena of gravity. It 
agrees with the Newtonian force for masses at rest. But if a body A moves with 
varying speed w about another body B, that moves inertially with speed v, the 
Newtonian attraction between A and B must be supplemented, in Lorentz’s 
theory, with four force terms depending on v or w, of the second order 
relatively to “the very small quantities” v/c and w/c.° The assumption that 
gravitational action propagates with the speed of light c does not therefore 
clash with experience — as it would if the supplementary force terms were of 
the first order in v/c.© Lorentz calculated the precession of Mercury’s 
perihelion according to his theory. The value he obtained was very close to the 
Newtonian prediction and too small to account for the anomaly.’ 

The old wish to get rid of instantaneous action at a distance became an 
urgent need with the advent of Special Relativity. As we know, the principles of 
this theory imply that information cannot travel faster than light, and hence, 
that nothing can react instantaneously, according to a deterministic law, to a 
change in the position of a distant body. The incompatibility of Newtonian 
gravitational theory with Special Relativity can be inferred from the fact that 
the formulae (1.7.1) and (1.7.4) on pages 32 and 33 do not preserve their form 
when subjected to a Lorentz transformation. It is also made evident by the 
following consideration: in a Newtonian two-body system the acceleration of 
each body at a given time depends on a simultaneous position of the other 
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body; in Special Relativity, however, the latter is not univocally defined, but 
depends on the chosen inertial frame of reference—and what is simultaneous 
in one such frame is generally not simultaneous in another. We recall moreover 
that Newtonian gravity is a function of the inertial masses of the interacting 
bodies which, in Special Relativity, are also frame-dependent. Given the 
pervasiveness of gravitational phenomena and the Newtonian theory’s success 
in predicting them, the situation was fraught with difficulty for Special 
Relativity. 

The problem was first noted and tackled by Henri Poincaré, in the final 
section of his paper “Sur la dynamique de l’électron”. Though his motivating 
statement is couched in the language of the Lorentz-Poincaré aether theory 
(Section 3.7), his treatment of the subject is well adjusted to the chronogeomet- 
ric approach proper to Einsteinian Relativity. Indeed, it was for dealing with 
this question that Poincaré devised the mathematical methods subsequently 
taken up and developed by Minkowski (p. 86). Poincaré points out that the 
Lorentz—Poincaré theory would fully account for the impossibility of 
detecting absolute motion if all forces had an electromagnetic origin. But some 
forces, such as gravity, do not have that origin. One must therefore assume that 
every force “is affected by a translation (or, if you prefer, by the Lorentz 
transformation) in the same way as the electromagnetic forces”.? It follows 
that gravitational attraction must depend not only on the positions, but also 
on the velocities, of the interacting bodies. Moreover, while it is natural to 
assume that the force acting on the attracted body A at a given moment 
depends on A’s position and velocity at this moment, it shall depend on the 
position and velocity of the attracting body B at an earlier moment, “as if 
gravitation took some time to propagate”. Also, since “astronomical obser- 
vations do not seem to exhibit any significant deviation from Newton’s law”,'° 
our Lorentz invariant formulae must agree completely with the latter if A and 
Bare both at rest and must differ from it as little as possible if they are moving 
slowly. Choose a Lorentz chart x and let x°, and x} denote, respectively, the 
times, in this chart, at which A suffers and B exerts the gravitational attraction 
that we wish to calculate. Let x%, be the space coordinates of A at time x9 and 
x% those of B at x. We further denote by U, and U,, the 4-velocities of A (at 
the worldpoint mapped by x on x 4) and B (at the worldpoint mapped by x on 
Xg), and by F the gravitational 4-force exerted by B on A (as each goes through 
the said worldpoints). U‘,, U‘; and F' are the components of these 4-vectors 
relative to x. We let R‘ stand for the coordinate difference x‘, — x',. The R‘ are 
plainly the components, relative to x, of a 4-vector which we shall denote by R. 
Poincaré seeks for Lorentz invariant laws stating the dependence of the time 
lag R° on the position and velocity components R%, U%, and U4, and of the 4- 
force F on R, U, and Uz, For the former he proposes two alternatives: either 
<R,R> =n, R'R/ = 0, or (R, Ug> = n;;R'US = 0. In the latter case, how- 
ever, R° might turn out to be positive, implying that x}, > x%,a possibility that 
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Poincaré will not countenance, for “if one conceives that the effect of 
gravitation requires some time for propagating, it would be harder to 
understand how that effect could depend on a position not yet attained by the 
attracting body”.'’ He assumes therefore that R° equals the negative square 
root of R*R%, so that gravitation propagates in vacuo with the speed of light 
(equal to 1 in our—and in Poincaré’s—units).!* Asa 4-vector, F will depend on 
R, U, and Ug linearly, or not at all. We must therefore have F = «R + BU, 
+ yU g, where a, 8, and y are scalar functions of R, U ,and Ug. Tacitly assuming 
that the Newtonian gravitational constant times the “gravity charges” (the 
gravitational masses) of A and B equals 1, Poincaré concludes that the simplest 
solution that converges to the Newtonian Law as A and B approach rest is 
obtained by setting B = 0, y = (R,U,>/a<¢U,, U,> and a= <R, U,>~*. 
Consequently, in the special case considered:!? 


_ -3 < R, U, > 
F = <R, Ug) (R ZU, Us) Us). (5.1.1) 
Poincaré points out that other, less simple solutions will also meet the 
prescribed requirements. He is well aware that the discussion has been sketchy 
and tentative, but he believes that it would be premature to carry it any further. 

Two points are worth noting before we turn to Einstein’s work on gravity. In 
the first place, Poincaré’s force F is not the gradient of a potential. Poincaré’s 
theory, if true, would explain gravitational phenomena by direct-—though 
delayed-—action-at-a-distance of one body on another, without mediation of 
an intervening field. Only in the Newtonian limit does it become equivalent to 
a field theory, based on Poisson’s equation (1.7.4). Einstein’s aim, on the other 
hand, was always a gravitational field theory, governed by differential 
equations that in the Newtonian limit converge to Poisson’s, and thus, through 
the equivalence of the latter with Newton’s Law, simulate, in that ideal limit, 
the magic of direct action-at-a-distance. In the second place, Poincaré, does not 
touch on the question, critical to any relativistic reform of Newton’s theory of 
gravity, regarding the concept, or concepts, of mass (p. 32). In Lorentz 
invariant dynamics the inertial mass depends on velocity. Does the same hold 
for the gravitational mass, i.e. for the quantity, characteristic of each body, that 
determines the magnitude of the gravitational action it is capable of suffering 
or exerting? In the special case considered by Poincaré the gravitational masses 
of the attracting and the attracted body did not have to be mentioned by name. 
Nor did he need to worry about the inertial mass, as he only sought a Lorentz 
invariant expression of the force of gravity, not of the gravitating body’s law of 
motion. !* 


5.2 The Principle of Equivalence. 


At the begining of his article “Zur Dynamik bewegter Systeme”—submitted 
to the Berlin Academy on June 13th, 1907—-Max Planck remarked that a series 
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of ideas and laws which until then had been usually regarded as steadfast and 
almost self-evident presuppositions of all theoretical speculations in physics 
had to be radically revised in the light of recent research, and that some of the 
simplest and most important among them could no longer claim exact validity, 
even if they might still be useful as excellent approximations.' The last of the 
three examples with which Planck illustrates his remark is presented as 
follows: 


The thermic radiation in a perfectly void cavity surrounded by reflecting walls certainly 
possesses inertial mass. Does it also have ponderable mass? If, as it seems most likely, this 
question is to be answered in the negative, the generally assumed identity of inertial and 
ponderable mass, confirmed hitherto by all experiments, is evidently destroyed.” 


A few months later, in the long review article on Relativity com- 
missioned by the editor of the Jahrbuch der Radioaktivitat and delivered 
on December 4th, 1907, Einstein made this identity, that Planck so readily 
called in question, into a matter of principle, which he thenceforth consistently 
treated as the key to the problem of gravitation. In § 11, “On the dependence of 
mass on energy”, after arguing for the thesis that “all inertial mass is to be 
thought of as a store of energy”,’ he discusses the possibility—already 
suggested at the end of Einstein (1905e}—of comparing the mass loss suffered 
by a radioactive substance with the amount of energy radiated by it. Since the 
mass loss can only be measured by weighing the substance, the experiment 
“tacitly presupposes [...] that the inertia and the weight of a system are 
exactly proportional under all circumstances. Hence we must also assume, for 
instance, that the radiation enclosed in a void cavity not only possesses inertia, 
but weight as well”.* After recalling that the proportionality between inertial 
and gravitational mass holds—with the hitherto achieved experimental 
accuracy—for all bodies without exception,” Einstein adds that in the final 
section of the paper he will propose a new argument in support of the 
preceding assumption. The argument rests on an alleged extension of the 
Relativity Principle, presented in §17, which I shall now quote in full. 


Until now we have applied the Principle of Relativity, ie. the assumption that natural laws 
are independent of the state of motion of the reference frame, only to unaccelerated frames. 
Is it conceivable that the Principle of Relativity also holds for frames that are accelerated 
relative to each other? 

This is not the place for a thoroughgoing discussion of this question. But since it cannot be 
avoided by anyone who has followed the applications of the Relativity Principle to this 
point, I shall here take a stance to the question. 

We consider two moving systems 2, and Z,. Let 2, be accelerated in the direction of its 
x-axis, and let + be the (constant) magnitude of its acceleration. Let 2, be at rest ina 
homogeneous gravitational field which imparts to all objects the acceleration —y in the 
direction of the x-axis. 

As far as we know, the physical laws referred to Z, do not differ from the laws referred to 
2; this is due to the fact that all bodies undergo the same acceleration in the gravitational 
field. Hence, in the present state of our experience, we have no inducement [Anlass] to 
assume that the systems 2, and 2, differ in any respect. We agree therefore to assume from 
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now on the complete physical equivalence of a gravitational field and the corresponding 
acceleration of the reference frame. 

This assumption extends the Principle of Relativity to the case of the uniformly 
accelerated translation of the reference system. The heuristic value of the assumption lies in 
the fact that it enables us to substitute a uniformly accelerated reference frame—which can, 
up to a point, be dealt with theoretically—for any homogeneous gravitational field.® 


The assumption introduced here as an extension of the Relativity Principle 
to uniformly accelerated frames will later be called by Einstein the Principle of 
Equivalence. We note that the only empirical evidence adduced for it consists 
of the fact “that all bodies undergo the same acceleration in the gravitational 
field”, or, in other words, that the inertial and the gravitational mass are 
proportional. The statement that they are always exactly so has been dubbed 
by R. H. Dicke the Weak Equivalence Principle. It implies that ordinary 
mechanical experiments with ponderable bodies will follow the same course in 
a uniformly accelerated laboratory and in a laboratory at rest in a matching 
homogeneous gravitational field. But Einstein, encouraged perhaps by the 
analogy with his RP—normally understood as extending Newtonian relativity 
from mechanics to the whole of physics—did not hesitate to proclaim that two 
such laboratories are completely equivalent, so that it is impossible to 
distinguish between them by means of physical experiments of any kind. Dicke 
calls this statement the Strong Equivalence Principle.”? No analogue of the 
Michelson—Morley experiment was at hand to support the extension of the 
Equivalence Principle to radiation,® but, as we have seen, Einstein was content 
to point out that there was no evidence against it. In the same paper he inferred 
from the Strong Principle two effects of gravitation on radiant energy, which 
have been verified later, though with nothing like the precision attained in the 
verification of the Weak Principle: the bending of light-rays in a strong 
gravitational field and the frequency shift suffered by radiation as it goes up or 
down a gravitational gradient.’ These effects can count, to some extent, as 
evidence for the Strong Principle. However, the most powerful inducement for 
embracing it is that it explains the equality of inertial and gravitational mass, 
which the Weak Principle merely asserts as a fact: if the systems Z, and 2, are 
physically equivalent, the same property of a body that manifests itself as 
weight-——a propensity to fall—in 2, must show up as inertia—a propensity to 
lag behind—in £,.'° 

While Einstein’s description of his strong Principle of Equivalence (here- 
after EP) as “a natural extrapolation of one of the most universal empirical 
statements of physics”! is not unjustified, his original presentation of it as a 
generalization of the Relativity Principle is completely misleading. The 
physical equivalence between reference frames at rest in a homogeneous 
gravitational field and uniformly accelerated frames does not obliterate the 
physical inequivalence between the latter and inertial frames. It simply entails 
that a reference frame at rest in a gravitational field cannot be inertial. Thus, 
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for instance, a light-ray emitted in 2, or Z, parallel to the y-axis does not keep 
this direction but inclines towards the decreasing x’s; a light-pulse sent by a 
source at rest in £, or &, in the direction of increasing x (ie. up the 
gravitational potential slope in Z,) will appear redder in the eyes of an observer 
placed in either system, the farther he lies from the source.'? Indeed, if the 
system 2,, moving relative to the inertial frames with constant acceleration y, is 
equivalent to the system &,, at rest in a homogeneous field G in which all 
bodies gravitate with acceleration —y,a frame immersed in G can behave as if 
it were inertial only if it is falling freely. Just as the inertial force that apparently 
acts on bodies referred to X, will disappear as soon as we refer them to an 
inertial frame, the gravitational pull they seem to suffer in Z, can be made to 
vanish by referring them to a freely falling frame. The force of gravity appears 
thus as a kinematic—or chronogeometric—effect, that can be dispelled by a 
suitable coordinate transformation. The real gravitational field is of course 
inhomogeneous and none of the above need apply to it. However, one can 
always choose, around any worldpoint Q, a spacetime region Ug in which the 
gravitational field is homogeneous to some agreed approximation. Let A bea 
body falling freely and without rotation, on whose worldpath lies Q. A chart x 
defined on Ug, according to the standard rules for the construction of Lorentz 
charts (p. 54), by means of natural clocks and unstressed rods at rest in A, is 
called a local Lorentz chart at Q. The EP, as understood by Einstein, implies 
that no effect of gravity will be disclosed—within the agreed margin of 
precision—by any description of natural phenomena in terms of x; and that 
the laws of nature take the same form in Ug, when referred to x, as they would 
when referred to an ordinary Lorentz chart in a spacetime region where gravity 
is absent.!* Since gravity-free regions, as far as we can tell, are confined to 
Utopia, the EP implies that Special Relativity can in effect hold good only in 
the domains of local Lorentz charts, within the margin of accuracy to which 
the gravitational field is homogeneous there. Far from generalizing the 
Principle of Relativity, the EP drastically alters its meaning and restricts its 
scope. For let y be a local Lorentz chart at a point Q’, distinct from Q. Then 
x:y”! is generally not a Lorentz coordinate transformation because the 
domains of x and y do not usually overlap, and, even if they do, their respective 
frames generally move with non-constant velocity past each other. On the 
other hand, x and y can always be so chosen that their ranges coincide, at least 
in part. The mapping x‘ - y—or a suitable restriction of it—is then indeed a 
bijection of a neighbourhood of Q’ onto a neighbourhood of Q, which we may 
call a local Lorentz point transformation, and use for comparing experiments 
performed near Q and Q’. Conjoined with the EP, the RP implies that two 
experiments whose initial conditions read alike in terms of x and y will also 
have the same outcome in terms of these charts. (The modified RP can be 
expressed intuitively by rewriting the Galilei text on page 31 as a comparison 
between two non-rotating skylabs in orbit around any two celestial bodies, 
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instead of two ship-cabins respectively at rest and in uniform motion; needless 
to say, all the empirical evidence for the RP is local and supports precisely its 
modified version.) 

The domains of local Lorentz charts may cover indeed all spacetime, and 
thus the natural laws that hold in each can be said to hold everywhere. But see 
with what limitations: as referred to such local charts, the laws do not 
encompass the universe, but only some tiny, ephemerous fragment of it. The 
aggregate of such fragments, in each of which Special Relativity is ap- 
proximately valid, can only be thought of as a cosmos in the light of another, 
more powerful and truly universal theory. This theory would have to weld into 
a single, continuous, though highly inhomogeneous gravitational field the 
mosaic of nearly homogeneous fields into which spacetime can be broken 
down. Einstein’s quest for such a theory lasted eight years, until November 
1915. But it probably did not begin in earnest before 1912, when he publicly 
acknowledged that the Theory of Relativity, in the original form, was valid 
only in “the important limiting case of spatiotemporal becoming in the 
presence of a constant gravitational potential”—ie., in the absence of 
gravity-—while its new version, enriched by the EP, could only be applied to 


“infinitesimal spaces”.'* 


5.3 Gravitation and Geometry, circa 1912 


Einstein returned to the subject of gravitation in 1911, in a paper 
“Concerning the influence of gravity on the propagation of light”. He no 
longer presents here the Equivalence Principle as an extended Principle of 
Relativity. In §1 he argues that the EP was already part and parcel of 
Newtonian dynamics, though nobody had acknowledged its role in the 
foundations of natural philosophy.! Einstein grants that we are assured of the 
principle solely in its weak form, but contends that it attains a deeper 
significance only in the strong version, i.e., as the statement that “the laws of 
nature referred to [a coordinate system at rest in a homogeneous gravitational 
field with gravitational acceleration — y] and those referred to [a coordinate 
system moving with uniform acceleration y in the absence of gravity] are 
perfectly coincident”.? In §2 he shows by means of one of his clever thought- 
experiments that radiation must gravitate if the weight of a body is 
proportional to its inertial mass, and, in agreement with Special Relativity, the 
latter increases when the body absorbs radiant energy. For if radiation were 
not affected by gravity and could be lifted at no cost, say, from the ground level 
A toa point B at a higher gravitational potential, we would be able to create 
energy by radiating it from A to B, and returning it from B to A stored ina 
falling body. After a full cycle the energy E initially radiated would increase to 
E(1+yhc~*) where y is the magnitude of the gravitational acceleration, h the 
height of B above A and c (= 1, in our units) is the speed of light. The 
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conservation of energy requires therefore that the energy absorbed at B be less 
than that radiated from A. In § 3 Einstein uses the EP to prove, in the manner 
sketched in note 12 to Section 5.2, that radiation must also lose frequency as it 
travels from A to B.? As in 1907, he interprets this to imply that gravity has a 
chronometric significance. Radiant energy can be transferred from one place 
to another by an essentially stationary periodic process. The number of 
periods per unit of time can then be less at the receiving point than at the point 
of emission only if the time unit itself is longer at the latter than at the former. 
If we assume, as usual, that our equations predict the measurements that will 
be obtained with relativistic clocks obeying the clock hypothesis (p. 96), we 
must conclude that such clocks slow down as the gravitational potential 
decreases. Gravitation must therefore be linked to proper time—the quantity 
measured by relativistic clocks. As the speed of light in vacuo is supposed to be 
constant when measured with clocks free from the influence of gravity, 
Einstein infers that when measured with local clocks it must also vary with the 
gravitational potential. If c, denotes the speed of light, thus measured, at a 
point where the gravitational potential is ~, we must have that 


Co = Co(1 + pcg”) (5.3.1) 


Hence, “according to this theory, the Principle of the Constancy of the Speed 
of Light does not hold in the formulation in which it usually underlies the 
ordinary [gewéhnliche] Theory of Relativity”.* Einstein postulates, without 
adducing further evidence, that the equations derived from the EP for 
homogeneous gravitational fields also hold for inhomogeneous fields.> This 
enables him to calculate the redshift suffered by light coming to the Earth from 
the surface of the Sun as 2 Hz per million,® and the deviation toward the Sun of 
a light-ray grazing it as 0.83”.” 

Except for the latter—erroneous—prediction, Einstein’s paper of 1911 does 
not add much to what he had already said on the question of gravity in the 
review article of 1907, but he was now much more successful in drawing to it 
the attention of his fellow scientists. The year 1912 witnessed a veritable 
flowering of relativistic and post-relativistic theories of gravitation. Max 
Abraham incorporated Einstein’s new conception of the dependence of the 
speed of light on gravity into a full-fledged field theory of gravitation, which 
used the Minkowskian idiom to convey a non-relativistic message.* Ginnar 
Nordstré6m proposed a different, no less comprehensive theory, based on 
inconditional adherence to the constancy of the speed of light.? Gustav Mie, in 
a long three-part article on the “Foundations of a Theory of Matter” put 
forward a strictly relativistic field theory which supposedly accounted for both 
electromagnetism and gravitation and was expected to throw light on the 
unsolved problems of microphysics.'° Einstein himself published two papers 
dealing only with static gravitational fields, in which—as in Abraham’s 
theory—the local speed of light is a measure of the potential.'' But he must 
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have continued to ponder over the subject in a broader, more realistic 
perspective, for in the following year he published, in collaboration with 
Marcel Grossmann, the outline of a radically new approach to gravity, 
together with a general theory, in partial agreement with it.!? Though the new 
approach did not come to fruition until almost 3 years later, when Einstein 
discovered the field equations of General Relativity in November 1915, the 
Einstein—Grossmann paper of 1913 already subscribes to the following closely 
interrelated propositions, that set it apart from all earlier contributions to the 
subject. 


A. It is assumed a priori that spacetime is a 4-manifold endowed with a 
Riemannian metric of signature — 2 (i.e. a Lorentz metric, as defined 
on page 93) but the further determination of this metric is left to 
experience. Indeed, the assumption that it has the same signature as 
the Minkowski metric 4 is itself empirically motivated, insofar as it 
entails that every tangent space of the manifold is isometric with 
Minkowski spacetime, and thus accounts for the local success of 
Special Relativity. 

B. The gravitational potential is represented by a symmetric, nowhere 
zero, (0, 2)-tensor field on the spacetime manifold. We denote it by g. 
Observe that, by definition, g is a Riemannian metric (pp. 93, 100). 
Naturally, g must depend on the actual distribution of matter. 

C. The tensor field g is precisely the spacetime metric referred to under A. 
Natural clocks measure the length, determined by g, of their respective 
worldlines (p. 94, eqn. 4.2.5). Moreover, it is postulated that a freely 
falling particle describes a timelike Riemannian geodesic of the metric 
g, and that the worldline of a light-pulse is everywhere tangent to a 
null vector.'? (This, in turn, implies that light-pulses describe null 
geodesics.)'* 


Before trying to ascertain the reasons that led Einstein to these momentous 
innovations, it will be useful to comment on the underlying spacetime 
geometry and the mathematical representation of the potential in the rival 
theories of gravity mentioned above. 

Abraham’s geometry is perplexing. He represents all physical magnitudes by 
4-tensors and 4-tensor fields on spacetime and states the laws of nature as 
equations involving the components of these geometrical objects relative to a 
typical chart with space coordinates x', x? and x°, and time coordinate 
x* = ict, where c is the speed of light and t a chronometric function which 
Abraham wastes no time in defining. The gravitational potential ® is a scalar 
field satisfying the equation 


o7® 
ox'dx! 


= 4nyv (5.3.2) 
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(where y is the gravitational constant, v is the “density of energy at rest”, and 
the index i ranges over {1, 2, 3, 4}). The 4-force F exerted by the gravitational 
field on a particle of unit mass has components 
o® 
Oxi 


Fi= (5.3.3) 
It follows at once that the “inner product” F'dx!/dt of the 4-force by the 
particle’s 4-velocity equals —d@®/dt, and must therefore be different from 0 if, 
as one would normally assume, the potential varies along the particle’s 
worldline. Abraham unquestioningly equates the 4-force F with the particle’s 
4-acceleration A (components: d?x!/dt?) and naturally concludes that the 
speed of light c cannot be a universal constant but must satisfy the equation 
cdc = d®. Hence, the chart employed is certainly not a Lorentz chart. 
Abraham gives the line element squared as ds? = dx*dx*—c*dt?.'5 
Consequently, his spacetime is indeed a Riemannian 4-manifold with a 
Lorentz metric, but the metric is not Minkowski’s.'®© We may well ask what do 
Abraham’s 4-tensor fields, their magnitudes and components mean? For, as we 
noted on p. 106, such geometrical objects only take components relative to 
Lorentz charts. And what is the sense of the differential operators that he 
allows to act on them? For he can hardly expect to obtain any physically 
significant quantities by differentiating tensor components with respect to the 
coordinates on an arbitrary Riemannian manifold. At first Abraham claimed 
that his charts were mutually related by Lorentz transformations in each 
infinitesimal region,'’ but Einstein had no difficulty in proving that he was 
mistaken.'* Abraham accepted Einstein’s argument but emphatically denied 
the implication that his own “conception of time and space [was] untenable 
even from a purely formal mathematical point of view”.!? According to him, 
Einstein had merely proved the untenability of “every relativistic conception of 
spacetime”; but he, Abraham, was willing to return to an absolutist spacetime 
theory and to single out the reference frame in which the gravitational field is 
static or quasi-static as the one in which bodies rest or move “absolutely”.?° 
Soon thereafter he brazenly announced that he preferred “to develop the new 
theory of gravitation without going into the space-time problem”.”' Yet he 
continued to express the laws of nature in the chronogeometrical idiom he had 
learnt from Minkowski and Sommerfeld. No wonder Einstein felt that 
Abraham's theory was “the monstruous child of perplexity”.?? 

Gunnar Nordstrom (1912a) accepts equations (5.3.2) and (5.3.3), but aptly 
observes that they can be asserted together with the constancy of the speed of 
light c, if, as he will assume, mass is a function of the gravitational potential ®. 
If mis the rest mass of a particle moving with 4-velocity U across the potential 
®, the equations of motion will be: 


m——=m + Ui (5.3.4) 
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We have that U'U'= —c? and U'd@/éx' = d@/dt. If c is constant, 
U'dU'/dt = 0. Multiplying both sides of (5.3.4) by U' and summing over i, we 
obtain: 


(5.3.5) 


Hence, m = my exp (®/c?), where mp denotes the rest mass of the particle 
at potential ®=0.?> Another innovation subsequently introduced by 
Nordstrom was to regard the gravitational factor y in (5.3.2) not as a constant, 
but as a function of (i) the internal state of the gravitating body, and (ii) the 
gravitational potential.2+ By assuming (i) one is able to define mass with 
considerable freedom. Thus Nordstr6m (1913) proposes to equate v, the 
“density of energy at rest”, in the source term of equation (5.3.2), with the trace 
T of the components, relative to the chosen chart, of Laue’s energy tensor 
(p. 120).2° This stipulation, plus the hypothesis (ii) that y depends on 9, will 
ensure, according to Nordstrém, that the Weak Equivalence Principle holds, 
at least in an important family of cases.?° 

Contrary to what one might expect, the universal constancy of c in 
Nordstrom’s theory does not guarantee that the charts employed are 
Lorentzian or that spacetime has the Minkowski metric. For, as it turns out, 
the gravitational potential has an effect on rods and clocks that muddles up the 
physical meaning of the coordinates. Nordstr6m squarely faces the issue. After 
proving that according to his theory the rest-length of a body decreases with 
the potential, he remarks that 


since the speed of light is constant, it is clear that the time in which a light-signal propagates 
from one end of a rod to the other increases in the same proportion as the rod’s length. This 
time is therefore inversely proportional to the gravitational potential.?’ 


Hence it is necessary to take the gravitational potential into account in the 
definition of the units of time and length. However, the units defined, say, at the 
potential of the Earth’s surface can be used, according to Nordstrom, 
everywhere, if measurements are performed with telescopes and by exchange 
of light signals, “for light-rays propagate along straight lines with constant 
speed c”.*8 In the last sentence there is an ambiguity that must be dispelled. If 
by velocity we mean coordinate velocity—i.e. the derivative of the position 
coordinates of the moving object with respect to the time coordinate—the 
constancy of c implies that the charts we use will map the worldline of any 
light-ray onto a straight segment in R*, but it does not tell us anything about 
the shape of the worldline itself, unless we already know the chronogeometric 
nature of the charts (e.g. that they are isometries). If, on the other hand, the 
speed of light is conceived—as in Nordstrém’s discussion of the effects of 
gravity on rods and clocks—as a ratio of locally measured distance to local 
time, its constancy may indeed teach us something about the actual shape of 
light-rays in spacetime, but only if those local measurements are related to the 
global metric (e.g. by postulating that a good clock measures the length of its 
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worldline, etc.). In either case we cannot adjudge the shape of light-rays or 
carry out measurements in the manner suggested by Nordstrom unless the 
spacetime metric is defined beforehand. 

Einstein and Fokker (1914) threw light on the geometry underlying 
Nordstrém’s theory a few months after the Einstein-Grossmann theory was 
made public. Their main purpose was to show that Nordstrém’s equations can 
be derived from propositions A, B, and C on page 139, together with a single 
additional hypothesis; namely, that spacetime admits a family of global charts, 
with Cartesian space coordinates, in terms of which the coordinate speed of 
light is a universal constant. Let us call them Nordstr6m charts. For 
simplicity’s sake we set the said constant equal to 1. As on p. 139, we denote the 
spacetime metric and gravitational potential tensor by g. Let g;; be its 
components relative to a Nordstrom chart x. It can then be proved that — goo 
= 911 = 922 = 933, and that y,,=0 if i#j.*° Hence g stands to the 
Minkowski metric 4 in the simple relation g = ©7y, where © is a scalar field 
equal, at each worldpoint, to the local value of \/(—goo).°° Einstein and 
Fokker discuss the mathematically possible and physically plausible relations 
between ® and the energy tensor density by which they represent—as in the 
Einstein-Grossmann theory (p. 158)—the distribution of matter. Their 
argument leads to a field equation equivalent to Nordstrém’s final version of 
(5.3.2), with the trace of the energy tensor density as source term, and 9, of 
course, as the gravitational potential. The identification of Nordstrom’s scalar 
potential ® with the only independent component—relative to a Nordstrom 
chart—of the Einstein-Fokker metric tensor potential g, is consistent with the 
effect of ® on natural clocks and rods and contributes to make it intelligible. 
Since the coordinate differentials dx' of a Nordstr6ém chart x and the line 
element ds satisfy the equations ds? = —@?dx*dx* along any spacelike 
worldline on which x° = const.,and ds? = ©? (dx°)? along any parametric line 
of the time coordinate x°, all “naturally measured times and lengths must be 
multiplied by the factor 1/® in order to yield coordinate times and coordinate 
lengths”?! 

Let us now turn briefly to the third theory of gravity developed in 1912. This 
is part of the broad scheme of Mie’s Theory of Matter, which was expressly 
designed to comply with Einstein’s Relativity Principle, as originally conceived. 
Mie emphasizes that his field equations “are the same in every coordinate 
system that can be reached by means of a Lorentz transformation”.>? We may 
therefore presume—even though Mie does not discuss geometry—that his 
spacetime is Minkowskian and that his coordinate systems are Lorentz charts. 
The gravitational field of a material particle is represented by a 4-vector field 
with timelike part u and spacelike part g, related to a scalar field, the 
gravitational potential w, by the equations 
Ow 0w 


(5.3.6) 
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If the particle has gravitational rest mass m and moves with speed v in the 
direction of increasing x’, the value of w at a point P, reached from the particle 
by the radius vector r, and with which the x! axis makes an angle 9, is given by: 


ym 


where y is a constant and p = |(1 + v? cos? 6 (1 —v”)~')!/?|.39 Einstein showed 
considerably less interest in this theory than in Nordstrom’s or even in 
Abraham’s, because it openly disavows the Equivalence Principle, even in its 

_ weak form.?* In Mie’s theory the gravitational and the inertial mass of a 
particle are equal if the latter is at rest, but not if it is moving. Hence they differ 
also in any body that contains thermal energy. Since the difference will then 
depend on the specific heat of the body, it can in principle be detected by an 
E6tv6s-type experiment. Mie reckons that such an experiment must be 
accurate to one part in 10'' or 10!? in order to test the predicted difference at 
room temperature.°® 


5.4 Departure from Flatness 


Einstein and Grossmann’s “Outline of a Generalized Theory of Relativity 
and a Theory of Gravitation” (1913) sets the stage for physical becoming in a 
Riemannian 4-manifold with a Lorentz metric which is equated to the 
gravitational potential. Since the latter depends on the presence and distri- 
bution of matter, the theory, when completed, should achieve Riemann’s aim 
of explaining geometry by physical interactions.' Einstein learnt about 
Riemann’s work and its formal development by Christoffel, Ricci and Levi- 
Civita, from his friend, the mathematician Marcel Grossmann, after joining 
the faculty of the Swiss Federal Polytechnic in Zurich in August 1912. There is 
evidence, however, that before benefiting from Grossmann’s help he had 
already lighted on the key ideas of the new approach to gravity. We may 
conjecture, therefore, that Grossmann, upon hearing of Einstein’s tentative 
thoughts, late in the Summer of 1912, saw that they could be made precise by 
couching them in the language of Riemannian geometry, and directed his 
friend’s attention to the intellectual resources that this comparatively little 
known branch of mathematics placed at his disposal. 

In this book, the general notion of an n-manifold (Riemann’s “n-fold 
extended continuous manifold”) is explained, in agreement with current 
mathematical usage, in Appendix A. The more specific notion of a Riemannian 
manifold was defined, together with the concepts of Riemannian metric and 
Riemannian geodesic, on pages 93-94. In order to understand Einstein’s new 
approach to gravity we also need Riemann’s concept of curvature. This is 
rigorously constructed in Appendix C. However, it will be useful to give here 
some informal, partly historical indications about it. 

Riemannian curvature extends to n-dimensional manifolds the more 
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intuitive and familiar concept of the Gaussian curvature of a surface. Let S bea 
smooth surface in Euclidean space, diffeomorphic to R’. Let P be any point of 
S. Every tangent to a curve drawn from P on S lies ona plane, the tangent plane 
at P, which we denote by Sp. If S is part of a cone or a cylinder, it can be 
mapped isometrically onto a neighbourhood of P in Sp. But this is not possible 
in general. In order to fit S snugly onto a region of the plane Sp one usually has 
to stretch it or shrink it at one place or another. A surface isometric to a region 
of the plane or pieced together from such surfaces is said to be flat, even if, like 
a cylinder, it looks perfectly round. The Gaussian curvature of our surface S$ 
measures, so to speak, its departure from flatness—in this sense of the word. 
The Gaussian curvature is a smooth real-valued function, i.e. a scalar field, on 
S. We cannot go here into Gauss’s deep but difficult definition of this 
function.” I shall merely give, by way of illustration, one of Gauss’s methods 
for calculating it. The reader is presumably familiar with the concept of the 
curvature of a plane curve.? A plane through the point P and orthogonal to Sp 
intersects S at a curve, called a normal section through P. The curvatures of all 
normal sections through P from a set of real numbers bounded above and 
below. It has therefore an infimum and a supremum, called the principal 
curvatures of S at P. The Gaussian curvature of S at P is equal to the product of 
the two principal curvatures. Hence, it can only be negative if the principal 
curvatures have different signs, i.e. if some of the normal sections through P lie 
to one side and some to to the other side of Sp. Otherwise it is always positive, 
unless one of the normal sections lies on Sp and is therefore straight, in which 
case the Gaussian curvature at P is 0. Observe that, as one would expect, it is 0 
everywhere on a flat surface, and is constant on a sphere; but on arbitrary S— 
e.g. on the inner surface of a sea-shell—it varies smoothly from point to point. 
Gauss showed also how to calculate the Gaussian curvature solely in terms of 
the functions E, F and G, by means of which he specifies the metrical relations 
ona surface “intrinsically”, without regard to the manner how it lies in space.* 
The mere existence of such an “intrinsic” formula for curvature proves Gauss’s 
Theorema Egregium: “If a curved surface is developed upon any other surface 
whatever, the [Gaussian] curvature in each point remains unchanged.”° 
Let us now consider a point P in an n-manifold M with Riemannian metric 
#. P has a neighbourhood U, diffeomorphic to R", such that each Qe U is 
joined to P by a unique geodetic path wholly in U.° We shall define a 
diffeomorphism f of U onto a neighbourhood of the zero vector in the tangent 
space Mp. For each Qe U there is a parametrization y of the geodetic path that 
joins Q to P such that y ts a geodesic, (0) = Pand y(1) = Q. We set f(Q) equal 
to y,p, the vector tangent to y at P. Let v and w be two linearly independent 
vectors in Mp. They span a two-dimensional subspace Mp(v,w) < Mp. The 
intersection of Mp(v,w) with f(U) is the image by f of a two-dimensional 
submanifold of U which we shall denote by S,,,,,). If M,as Riemann assumed, is 
a proper Riemannian manifold (p. 94), S,,,,,, with the metric induced by 4g, is 
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isometric to a smooth surface in Euclidean space and consequently has a 
Gaussian curvature k,,,, at P. We call k,, ,) the sectional curvature of M at P, 
determined by the tangent vectors v and w. If all sectional curvatures of M, at 
every Pe M, have the same value, M is a Riemannian manifold of constant 
curvature.’ If, in particular, that constant value happens to be zero, the 
manifold M and its metric yp are said to be flat. 

The foregoing discussion agrees with Riemann’s treatment of curvature in 
his inaugural lecture of 1854. By 1861 he had found a (0, 4)-tensor field, 
strictly dependent on the metric yz of a Riemannian manifold M, which 
determines and is determined by the sectional curvatures defined above. We 
call it the Riemann tensor, and denote it by R.® It possesses the following 
interesting symmetries: for any vector fields X, Y, Z, W on M, 


R(X, Y, Z, W) = —R(Y,X,Z, W) = —R(X, Y, W,Z) = R(Z, W, X, Y) 
(5.4.1a) 


R(X, Y, Z, W) + R(X, Z, W, Y) + R(X, W, Y, Z) = 0 (5.4. 1b) 


It can be shown that the sectional curvature of M at P, determined by the 
vectors v and w, satisfies the equation: 


kw, w) = &Rp (2, w, v, w) (5.4.2) 


where the factor « depends on the length of v and w and the angle between 
them.’ On the other hand, the symmetries (5.4.1) imply that the sectional 
curvatures at P, determined by every pair of linearly independent vectors in 
Mp, determine the value of the Riemann tensor at P.'° Hence, in particular, M 
is flat if and only if R is everywhere zero. Though we have defined sectional 
curvatures only for proper Riemannian manifolds, whose metric is positive 
definite, the Riemann tensor can be constructed from a Riemannian metric 
also if the latter is indefinite.'’ In this case, we simply use (5.4.2) as our 
definition of sectional curvature, setting « = 1 when v and w are any pair of 
orthogonal unit vectors. The concepts of flatness and constant curvature are 
thereby extended to arbitrary Riemannian manifolds. The components of the 
Riemann tensor R relative to a chart x of the manifold M depend only on the 
first and second derivatives, Op;;/0x", 07p,;/Ox"dx*, of the components of 
the metric p relative to the chart x.'? Consequently, if M has a global chart 
relative to which the components of yp satisfy the relations u;,= +06;,, R is 
identically 0 and M is flat. Hence, according to our definitions, Minkowski 
spacetime is a flat Riemannian manifold. On the other hand, it can be shown that 
a Riemannian 4-manifold with a Lorentz metric is flat only if each point of it 
has a neighbourhood isometric with a region of Minkowski spacetime. A 
manifold with this property is said to be locally Minkowskian. Such locally 
Minkowskian manifolds are, of course, an exception among Riemannian 4- 
manifolds with a Lorentz metric which, in general, do not even have a constant 
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curvature.!? Hereafter any Riemannian 4-manifold with a Lorentz metric (i.e. 
a metric of signature — 2) will be called a relativistic spacetime, or simply a 
spacetime, We refer to its underlying manifold as the spacetime manifold. The 
global topology of a 4-manifold is only minimally specified by the requirement 
that it should admit a Lorentz metric. The spacetime manifold need not be 
homeomorphic to R* and can therefore differ considerably from Minkowski’s 
world .4. For obvious physical reasons we shall always assume, however, that 
it is path-connected, i.e. that any two points of it can be joined by a curve.'* 
When speaking of a given spacetime manifold we shall often denote it by 
(for world). A point of W is a worldpoint, a path in W a worldline. 

Why did Einstein give up the secure definiteness of Minkowski spacetime 
and decided to build the theory of gravity on the moving sands of general 
Riemannian geometry? As we look for an answer to this question we ought to 
recall that Einstein must have almost made up his mind on this matter before 
Grossmann introduced him to Riemann’s theory. Otherwise, he would have 
shown little or no interest in this arduous and, at that time, esoteric subject.'* 
The grounds for Einstein’s decision must therefore have been such as might 
have led him to invent Riemannian geometry, proprio Marte, had it not been 
already available. In the Preface to the Czech edition of his popular exposition 
of Relativity, Einstein wrote: 


The decisive idea [den entscheidenden Gedanken] of the analogy between the mathematical 
problem associated with the theory and Gauss’s theory of surfaces first occurred to me in 
1912, after my return to Zirich, while I still did not know Riemann’s and Ricci’s or Levi- 
Civita’s investigations. '® 


Which was this analogy that Einstein termed entscheidend (literally: “decid- 
ing”, “effecting a decision”)? It must have been striking enough to be perceived 
by someone who lacked a full grasp of one of its terms, and yet sufficiently deep 
to precipitate a revolution in natural philosophy. The parallel I shall now draw 
meets both requirements. And even though Einstein might well not have seen it 
until much later,'” it will help us understand the less straightforward analogies 
that probably struck him first. 

Take a smooth surface S. S can be entirely covered by charts that map a 
neighbourhood of each point diffeomorphically onto a region of the plane, just 
like a topographical survey covers a continent by charting it piecewise onto 
small flat papers. If the neighbourhoods are small enough, the diffeomorph- 
isms can be so chosen that they are nearly isometric, their accuracy improving 
as the Gaussian curvature on their respective domains approaches 0. We may 
for certain purposes ignore the curvature, as we do when we read distances 
with a ruler ona road map. But it cannot be neglected in the long run; thus, the 
curvature of the Earth must certainly be reckoned with in intercontinental 
flights. Now consider spacetime, in the presence of ordinary gravitational 
fields, as we described it, in the light of the EP, on pages 136 f. It can be entirely 
covered by local Lorentz charts that map a small neighbourhood of each 
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worldpoint diffeomorphically onto a region of R*. If R* is endowed with the 
Minkowski inner product (p. 91), the local Lorentz charts are nearly 
isometric, their accuracy improving as the gravitational field in their respective 
domains approaches homogeneity. We can sometimes ignore the inhomo- 
geneities of the field, e.g. in a local test of Special Relativity. But we shail not 
rise above a narrow parochial worldview unless we take them into account. 
Spacetime can be likened to an aggregate of small contiguous Minkowskian 
regions, just as a surface can be approximated by a flat-faced polyhedron; but 
gravity “warps” the former as curvature does the latter. 

Though the foregoing analogy is both deep and striking there is no 
documentary evidence that in 1912 Einstein was thinking along such lines. 
Indeed, the link between gravity and the Riemann tensor, which the above 
comparison with Gaussian curvature obviously suggests, apparently dawned 
on him only in 1914.'° It seems that at first he did not pay much attention to 
local Lorentz charts, perhaps because, for lack of a tangible example such as we 
have in our orbiting spaceships, he would not fancy the idea of referring the 
laws of physics to a freely falling frame. On the other hand, he was much 
impressed by the impossibility of constructing a metrically meaningful 
spacetime chart by means of rods and clocks at rest in a gravitational field. This 
fact, which may have reminded him of the non-isometric parametrizations by 
which curved surfaces were represented in the Gaussian tradition, is a direct 
consequence of the EP: a chart adapted to a uniformly accelerated frame 
cannot be metrically meaningful, for it maps the worldline of a particle at rest 
in the frame onto a straight segment in R‘, though such a particle does not 
describe a geodesic in Minkowski spacetime (p. 96). Einstein however did not 
readily accept this consequence, for—as he remarks in his “Autobiographical 
Notes”—“it is not easy to free oneself from the idea that the coordinates must 
have an immediate metrical meaning”.'® 

There is strong evidence that Einstein, enlightened by his reflections on 
uniformly rotating frames of reference, did free himself from this idea before 
moving to Ziirich in August 1912.?° The behaviour, in Special Relativity, of a 
uniformly rotating rigid body was the subject of conversations and letters 
between Einstein, Born and Sommerfeld after Born proposed his definition of 
rigid motion at a congress in Salzburg in September 1909.7! On September 
29th Einstein wrote to Sommerfeld that the problem interested him on 
account of a possible extension of the Relativity Principle to uniformly 
rotating frames, corresponding to its extension to uniformly accelerated 
frames in the final section of the survey paper of 1907 (cf. p. 134).?* Though 
Einstein did not publish anything on the subject in the next few years, he must 
have pursued it privately, for in the first paper on the static gravitational field, 
submitted on February 26th, 1912, he cites a novel conclusion regarding 
rotating frames in support of an important remark. After postulating that the 
lengths measured with standard rods at rest in a homogeneous gravitational 


148 RELATIVITY AND GEOMETRY 


field or on a uniformly accelerated reference frame obey the laws of Euclidean 
geometry, he observes that this stipulation is not self-evident, and involves 
physical hypotheses that could eventually turn out to be false. Thus, for 
instance, 


in all likelihood, they do not hold in a uniformly rotating system, in which, due to the 
Lorentz contraction, the ratio of the circumference of a circle to its diameter would have to 
differ from 2, if our definition of length is applied.?3 


Einstein did not spell out the grounds for this statement until much later, in 
1916. In §3 of “The Foundations of the General Theory of Relativity”, he 
analyses the geometry of a uniformly rotating plane in order to show that the 
familiar description of spatial relations by means of isometric (Cartesian) 
coordinates defined by means of measuring rods “must be dropped and 
replaced by a more general one [. . .] if the Special Theory of Relativity holds 
in the limit in which there is no gravitational field”.?* Einstein’s analysis can be 
rephrased as follows. We consider a region of the world in which no effects of 
gravity are felt. Let F, be an inertial frame and let the frame F, rotate with 
constant angular velocity @ about a fixed point O in F,. (We may imagine each 
frame F, as consisting of three mutually perpendicular rods, joined at one of 
their endpoints.) We denote by S, the relative space of F,, i.e. the set of spatial 
locations that, relatively to F,, are permanently at rest. As usual, all metrical 
relations in S, are to be ascertained by means of measuring rods at rest relative 
to F,. It is assumed that every rod always agrees with a standard rod at rest in 
the former’s momentary inertial rest frame (mirf). This entails that the rods at 
rest in F, are unaffected by acceleration, or that the effects of the latter—e.g. 
the longitudinal stress on rods issuing at right angles from the axis of 
rotation—have been duly corrected for.?> The geometry set up by such means 
in the relative space S, will be called the standard rod geometry G,. If Special 
Relativity holds good in our gravitation-free region, G, must be Euclidean. In 
that case, however, as we shall now see, G, is not Euclidean. Let C, beacircle in 
S,, centred at O and orthogonal to the axis of rotation of F,. Let the radius of 
C, be R times the length ofa standard rod, where R|«| is less than c, the speed 
of light in F, (expressed in standard rod lengths per time unit), and yet R is so 
large that 2xR standard rods will fit snugly, end to end, on the circumference of 
C,. Let C, be the locus of all points of S, that lie on C, at a given time. By 
reasons of symmetry, C, must be a circle, the same at all times (so that we do not 
need to worry about the definition of time in F,). We shall now calculate the 
ratio of C,’s circumference to its radius in the standard rod geometry G,. A 
standard rod at rest in F, along a radius of C, has zero longitudinal velocity 
relative to F,. Hence the rod’s length in F,—1.e the G,-distance between 
simultaneous positions of its endpoints in S,—is equal to the length of a 
standard rod at rest in F;. Consequently, the number of rods at rest in F, that 
can be placed end to end along a radius of C; is R. Consider nowa standard rod 
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at rest In F, along the circumference of C,. At any time the rod’s mirf moves 
relative to F,, along a tangent to C;, with speed v = R|w|. Hence the rod’s 
length in F, is \/(1 —v?c~?) times the length of a standard rod at rest in Fy. 
How many standard rods at rest in F, can fit snugly, end to end, along the 
circumference of C,? Evidently, as many as can leave their distinct imprints 
simultaneously (in F;-time), end to end, on the circumference of C,, namely, 
2nR/./ (1 —v?c~”). Hence the ratio of the circumference of C, to its radius is 
greater than 21. The standard rod geometry G, is therefore Non-Euclidean. It 
follows that S,—with this geometry—cannot be mapped isometrically onto 
R? or be described by means of Cartesian coordinates.?° 

By the beginning of the Summer of 1912 Einstein was thoroughly convinced 
that physics could no longer rely on the availability of isometric coordinate 
systems and that the laws and observations involving the effects of gravity had 
to be referred to spacetime charts whose precise metrical significance was not 
fixed a priori. In his reply to Abraham (1912d), submitted on July 4th, he 
emphasized that even in the “very special case of the gravitation of masses at 
rest, the spacetime coordinates will forfeit their simple physical interpret- 
ation”.?’ The use of non-isometric coordinate systems was indeed a well- 
known feature of Gauss’s theory of surfaces. Such systems can be defined on 
any surface diffeomorphic to R?.*8 They are unavoidable, however, if the 
surface under consideration is not flat—and hence not isometric with R?. At 
some point in the course of his reflections Einstein must have been struck by 
the analogy between this familiar fact and the situation emerging in the 
relativistic theory of spacetime. Here again arbitrary charts can be introduced 
at will on a neighbourhood of any worldpoint. But if spacetime is 
Minkowskian they can always be replaced by a metrically meaningful Lorentz 
chart. When he wrote the reply to Abraham, Einstein had already understood, 
however, that Lorentz charts cannot be defined—not even by means of freely 
falling rods and clocks—-on the domain of a real gravitational field, that 
changes with place and time.?? Thus gravity would prevent a metrically 
significant charting of the world, much like Gaussian curvature precludes the 
definition of Cartesian coordinates on a surface that is not flat. Though the 
analogy 1s clear enough, it must have seemed amazing, for Einstein did not 
yield to its lure until he found a reason that made it comprehensible. This was 
provided by the hypothesis—included in Proposition C on page 139 that a 
freely falling particle describes a geodesic of the spacetime metric constituted 
by the gravitational potential. In his “Notes on the Origin of the General 
Theory of Relativity” (1934), Einstein explains how this hypothesis enabled 
him at last, in 1912, to see what coordinates really meant in physics.°° I shall 
now try to reconstruct the considerations that probably led him to it.?! 

According to the Law of Inertia a free particle z describes a timelike geodesic 
in Minkowski spacetime (p. 96). Let y be the worldline of 2, parametrized by 
proper time t. y is then “length-critical” on every closed subinterval of its 
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domain (pp. 94-95). This property of y is expressed in the classical notation of 
the Calculus of Variations by the variational equation 


5 fac =0 (5.4.3) 


which says that y’s length, | dt, remains stationary in every variation of y (see 
note 6, on p. 304). In terms ofa Lorentz chart x, the element of proper time drt is 
given by 

dt? = n,;dx'dx! (5.4.4) 


and y satisfies the equations: 


d?xi- y 
a = 9 (5.4.5) 


In terms of an arbitrary chart y, defined on a neighbourhood of z’s worldline, 
dt? = gidy'dy/ (5.4.6a) 


where the coefficients g;; are the components 9(0/dy',d/dy’) of the 
Minkowski metric 9 relative to y. (Note that, according to this definition, the 
right-hand sides of (5.4.4) and (5.4.6a) are equal; we have replaced the Lorentz 
chart by arbitrary coordinates but have retained the same flat metric.) Writing 
g'/ for the components, relative to y, of the contravariant form of 4 (p. 100), we 
introduce the functions?” 


i inf 99ni , O9kn — OG ix 
i _ 1 ih J kh 
ri=dtg ( ae al aa (5.4.6b) 


Equations (5.4.3) and (5.4.6) imply that y also satisfies the system: 
ayy dy y dy*-y 


dt? ar dt dt fe) 


To an observer who, for any reason, views y as metrically significant, the right- 
hand sides of equations (5.4.7) will appear as the components of a 4-force that 
acts on 7 as it traverses the field represented by the Vi. However, the 4-force 
and its supporting field can be made to vanish—like the inertial forces of 
classical mechanics—by a coordinate transformation, namely, x: y ~', where x, 
a Lorentz chart, truly discloses the metric. 

Suppose now that the particle z is falling freely in a real, inhomogeneous 
gravitational field. Its worldline will successively pass through the domains U,, 
U,,... of diverse local Lorentz charts x,, x2, ... On each such domain U,a 
natural clock satisfying the clock hypothesis (p. 96) measures along its 
worldline the proper time | dt of a Minkowski metric on U within the margin 
of precision to which the field is homogeneous on U. But the several flat 
metrics which thus approximately hold good in the domain of each local 


EINSTEIN'S QUEST FOR A THEORY OF GRAVITATION 151 


Lorentz chart cannot be regarded as the restriction to their respective domains 
of a global Minkowski metric, any more than the Euclidean metrics that show 
up, say, in the street plans of Mannheim and Manhattan are the restrictions to 
these boroughs of a Euclidean metric defined on the entire Earth. On each 
domain U the worldline of 2 satisfies, within the said margin of error, the 
variational law (5.4.3); yet the law does not make any sense beyond the 
boundary of U. However, the following conjecture is near at hand. Just as a 
tape-line measures, at any place of the Earth, the line elements of both the local 
Euclidean and the global “geoidal” metric, so a natural clock might measure, at 
any worldpoint, the element of proper time of the local Minkowski metric and 
the global, still to be determined, metric of spacetime. All that is required is that 
the global metric—which we henceforth denote by g—be approximated, ona 
small neighbourhood of each worldpoint, by the local flat metric 4. To ensure 
it we must only demand (i) that g has the Lorentz signature and (ii) that the 
local 4 agrees with g at the origin of the pertinent local Lorentz chart.*? If this 
is allowed, we can postulate further that, if f dt is now made to stand for the 
length of z’s worldline as determined by g—ie. for the proper time measured 
along it by a natural clock at x—the variational principle 


é fa: =0 (5.4.3*) 


is obeyed by the freely falling particle x between any two events in its history. We 
call this postulate the Geodesic Law of Motion, or simply the Geodesic Law. 
Einstein has referred to it as “a natural” extension of the Principle of 
Equivalence to arbitrary gravitational fields,** and also as “the core of the 
‘Hypothesis of Equivalence’”.** But it is certainly not a corollary of the EP, 
as the typographical identity of (5.4.3) and (5.4.3*) might suggest. The line 
element of the local flat metric 4 which occurs in (5.4.3) is not to be confused 
with the line element of the metric g which occurs in (5.4.3*), even if, by 
hypothesis, they are both measured at each worldpoint with the same 
instrument.*° If the g;; and g’! in (5.4.6) denote the components of g—not 7— 
relative to an arbitrary chart y, then, in the domain of y, the worldline y of the 
freely falling x must satisfy the equations: 


(5.4.7*) 


The functions I", appear here in the guise of “gravitational field strengths”, 
derived from the “potentials” g;;. Since in the presence of inhomogeneous 
gravity a coordinate transformation cannot reduce the g;; to the constant 
functions y;; over a finite region, the spacetime metric g is not flat. This 
explains why the coordinate systems of physics, which chart the world into the 
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flat “number manifold” R*, do not allow a direct metrical interpretation. As 
Einstein expressed it in London in 1921: 


The metrically real is now given only through the combination of the space-time coordinates 
with the mathematical quantities which describe the gravitational field.>” 


Though the foregoing ideas could not become perspicuous until Einstein 
had learnt the essentials of Riemannian geometry with Grossmann, they were 
hovering in his mind several months before he joined his friend in Zurich. At 
the end of his second paper on the static gravitational field he observes that, if 
we set H = —m ,/(c? —v?), where mand v are, respectively, the rest mass and 
the speed of a particle, and c denotes the local speed of light (which in that 
work is a function of the gravitational potential—p. 138), the particle’s 
free fall in a static field obeys the variational principle “6{Hdt = 0, or 
6 | /(c?dt? —dx? —dy?—dz?)=0. He then remarks: “The second 
Hamiltonian equation suggests [/asst ahnen] how the equations of motion ofa 
particle in a dynamical gravitational field are built.”2° 


5.5 General Covariance and the Einstein-Grossmann Theory. 


A particular relativistic spacetime may admit a chart or a family of charts 
specially suited to its structure. Thus, the Lorentz charts peculiar to flat 
spacetime are admirably fit for describing it. A Lorentz chart x partitions flat 
spacetime into equidistant hypersurfaces x° = const., each of which is a flat 
spacelike 3-manifold, isometric to Euclidean space. x is an isometry on each 
such hypersurface and also on each parametric line of the time coordinate 
function x°. Moreover, x is a global isometry of spacetime onto R’, if the latter 
is endowed with the Minkowskian—instead of the usual Euclidean—inner 
product. On the other hand, x can only be defined if spacetime is flat. Since 
Special Relativity assumed from the outset that Lorentz charts were readily 
definable in the real world, it naturally assigned them a preferred status and 
represented all physical quantities and expressed all natural laws in terms of 
such charts. But if nothing is known a priori about the actual geometry of 
spacetime and the specific charts that it admits, the representation of physical 
quantities and the formulation of natural laws must be referred, in principle, to 
arbitrary charts. (Unless, of course, one resorts to the chartless or “coordinate- 
free” mode of expression, an alternative that was not yet available in 1913, due 
to lack of clarity regarding the nature of geometrical objects.') 

The new approach to gravity proposed in Einstein and Grossmann’s 
“Outline” of 1913 therefore involves the demand that all physically significant 
quantities and the equations between them be referred indifferently to any 
conceivable spacetime chart. In 1916 Einstein stated this demand as the 
Postulate of General Covariance: 
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The universal laws of nature must be expressed by equations which hold good for all 
coordinate systems, that is, are covariant with respect to arbitrary transformations 
(generally covariant).” 


Compliance with this postulate obviously implies the following broadened 
version of Einstein’s Relativity Principle of 1905: 


The Principle of General Relativity (GRP). The laws by which the 
states of physical systems undergo change are not affected, whether 
these changes of state be referred to one or the other of two arbitrary 
spacetime charts. 


By deliberately imitating the wording of the 1905 RP (p. 54) I hope to make 
clear why Einstein persistently maintained that the new approach to gravity 
involves a generalization of Relativity. This view, embodied in the name 
“General Relativity” by which we still designate the theory of gravity in which 
the new approach finally proved its worth, is certainly legitimate. But we must 
not let it obscure the considerable differences of meaning and motivation 
which separate the first or “special” RP from the GRP proposed above. Though 
the former may seem to be, like the latter, a purely methodological rule for the 
statement of laws, it yields immediately—through the one-one correspon- 
dence between Lorentz coordinate and point transformations—a genuine 
factual principle concerning the observable course of phenomena (p. 65). 
Indeed it was the factual principle, corroborated by an increasing store of data 
that motivated the adoption of the rule of method: the laws of nature must take 
the same form with respect to all Lorentz charts because the inertial frames to 
which such charts are adapted are observationally equivalent. On the other 
hand, the GRP cannot rest on such grounds. In the first place, there is no 
one-one correspondence between arbitrary coordinate transformations and 
point transformations of spacetime. If x and y are two arbitrary charts such 
that x-y~' is defined on some open subset of R*, x~'-y may just map the 
empty set onto itself, for the ranges of x and y need not overlap. (This is in 
sharp contrast with the case of Lorentz charts, all of which share the same 
range—viz., R*—and the same domain—flat spacetime.) In the second place, 
even if x~'- ydoes mapa region of spacetime onto another, this does not entail 
the equivalence of the frames to which x and y are adapted, for there might not 
be any such frames at all. Suppose however that x and y are each linked to 
something that we may reasonably call a frame of reference. Suppose, for 
example, that each chart has a clear-cut time coordinate function. Such is the 
nature of, say, x° if the fibre of x° over each of its values is a spacelike 3- 
manifold and its parametric lines (i.e. the integral curves of 0/dx°) constitute a 
congruence of timelike worldlines in the chart’s domain. Since a congruence of 
timelike worldlines is the geometrical equivalent of a frame (p. 28), we may 
say that our charts x and y are adapted to the frames—i.e. the congruences— 
F, and F, defined by their respective time coordinates. (Since the lines of each 
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congruence need not be equidistant, such reference frames are not rigid—they 
are more like a creeping mollusc than a standing grille.) However, the fact that 
the laws of nature take the same form when referred to x or y does not imply 
that F, and F, are physically equivalent frames. They evidently are not 
equivalent if one of them is and the other is not a congruence of geodesics, or if 
both are congruences of geodesics but F,, covers a strongly curved region of 
spacetime and F, an almost flat one.’ Indeed such results were to be expected, 
given the motivation for the GRP. Acceptance of this principle is based not on 
physical knowledge but on geometrical ignorance; not on experiments 
purporting to show that under equal conditions all physical phenomena have 
the same course of development in any frame of reference, but rather on our 
inability to decide a prtori the structure of spacetime and the metrical 
significance of our coordinate systems.* The 1905 RP can translate a 
fundamental physical fact into a formal requirement concerning the invari- 
ance, under a particular group of coordinate transformations, of laws referred 
to charts of a particular family, because those charts have, by assumption, a 
fixed metrical meaning, and that group is, by definition, a representation of the 
group of motions of the underlying flat spacetime. The arbitrary charts and 
coordinate transformations mentioned in the Postulate of General Co- 
variance do not fulfil such conditions. The spacetime coordinates have now 
“sunk down to the status of meaningless freely choosable auxiliary variables”® 
while the set of coordinate transformations relating them is not the realization 
of a group,® let alone a group representation—the transformations are 
generally non-linear—and its structure betokens only the differentiable 
structure of the spacetime manifold, but has nothing to say about the 
spacetime metric or its group of motions (which, if the Riemann tensor varies 
freely from point to point, is probably none other than the trivial group, 
consisting of the identity alone).’ 

In Sections 4.3 and 4.5 we studied the method used by Minkowski—after 
Poincaré—-for complying with the demands of the RP. By identifying physical 
quantities with suitable 4-tensor fields on spacetime and expressing the laws of 
nature as equations between the field components, he ensured that the laws 
took the same form with respect to any Lorentz chart. We also noted the 
method’s great limitation: by definition, 4-tensor fields only have components 
relative to Lorentz charts, so that the equations of physics, in the Minkowskian 
version, become not just more or less false but meaningless when referred to 
non-inertial frames of reference. Since the EP entails that all laboratory frames 
on earth are non-inertial, it was imperative to find a less restrictive statement of 
the laws of nature. On page 106 I suggested how to achieve it within the 
framework of Special Relativity. All one has to do is to read the differential 
equations that link the components of 4-tensor fields relative to a Lorentz 
chart x, as equations between the components, relative to x, of the correspond- 
ing ordinary tensor fields on flat spacetime. The same equations will hold 
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between the components of the said tensor fields relative to an arbitrary chart, 
provided that we replace the ordinary differential operators which occur in 
their original formulation (referred to the Lorentz chart x) by the correspond- 
ing covariant differential operators.* To explain what this means and how it 
works let me introduce some new notation that we shall be using from now on. 
If T is a (q, r)-tensor field on a relativistic spacetime (W, g) and x is any 
given chart of W, we denote the (f)-th component of T relative to x by Tj 
(where a = (a,,..., @,) and b = (b,,..., b,) are lists of indices drawn from 
{0, 1, 2, 3} ). The covariant derivative VT is a (q, r + 1)-tensor field on W.° We 
write Th, for (VT), the (§,)-th component of VT relative to x (where 
bk stands for the list of r + 1 indices (b,, . . ., b,, k)). We write V,T for Vaja,*T, 
the covariant derivative of T along 0/éx*, which is a (q, r)-tensor field, like T. It 
can be shown that Tf, = (V,T)§.'° We shall represent partial differentiation 
with respect to the k-th coordinate function by a comma followed by k. Thus, 
TB, = OT §/0x*. We denote by a(m/a;) the list obtained by substituting m for 
a; in the list a. We have that 


q r 
The = The + x Peel pe > Poe T bomb (5.5.1) 
i=1 i=1 
where the Ij, are the scalar fields defined by (5.4.6b) in terms of the 
components of the metric g relative to x. If g is the Minkowski metric 7 and x 
happens to be a Lorentz chart, the 'j, are of course identically 0 and 


Te, = Thy (5.5.2) 


Every 4-tensor field T’ is the restriction to Lorentz tetrad fields of an ordinary 
tensor field T on Minkowski spacetime. A functional relation ®'(T,,T4, . . .) 
= 0 between 4-tensor fields is equivalent to a relation ®(T,,T,,...)=0 
between the corresponding ordinary fields. (5.5.2) tells us how to obtain a 
generally covariant formulation of the latter from a Lorentz covariant 
formulation of the former. Let ®’ = 0 be expressed as a system &’ of 
differential equations involving the components T{§, T24,...of T4, 
T,... relative to a Lorentz chart x, and the partial derivatives Tj ,,, 
:b,kj> >» » Of those components with respect to the coordinate functions of x. 
Then, by (5.5.2), the same system &’ expresses ® = 0 relative to x. To obtain a 
system & expressing ® = Qin terms of an arbitrary chart y, replace in 2’ (i) the 
components T; $, T2§,... of T,, T2,... relative to x, by their components 
Ti b» T2q -- - relative to y; (ii) the partial derivatives T{ 5, T1b,x; -- - of the 
former components with respect to the coordinate functions of x, by the 
components T, $x, Tibuj>-- - relative to y, of the covariant derivatives VT,, 
VVT,,... Thus, within the bounds of Special Relativity, general covariance 
can be achieved by “dotting the commas and unpriming the T’s”. 
In the curved spacetime of Einstein and Grossmann’s “Outline” the Lorentz 
covariant equations of Special Relativity hold good approximately with 
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respect to each local Lorentz chart. By the typographical procedure sketched 
above one can obtain from them generally covariant equations with a prima 
facie claim to global validity. Although this case differs significantly from the 
former, Einstein had no qualms about using the same procedure to figure out 
the laws of Relativity physics in the presence of irreducible gravitational fields. 
In 1914 he wrote with Fokker: 


To each quantity and operation of vector analysis on the [flat] manifold there is a 
corresponding more general quantity and operation of vector analysis on the manifold 
characterized by an arbitrary line element. Hence, the laws for physical phenomena of the 
original Theory of Relativity can be set in correspondence with matching laws of the 
generalized Theory of Relativity. The laws thus obtained, which are generally covariant, 
include the influence of the gravitational field on the physical processes.’'! 


The procedure certainly yields the simplest conceivable generalizations of the 
local laws, and it is reasonable to follow it while there is no inducement to 
search for something more elaborate (and hardly any indication of how to find 
it). However, we must bear in mind that its justification depends on a stronger 
hypothesis and a far less straightforward argument if spacetime is curved than 
if it is assumed to be flat. If T is a tensor field on the curved spacetime (¥, g) 
and pT’ ts a 4-tensor field defined in terms of local Lorentz charts near a 
worldpoint P, T cannot be obtained simply by lifting the restriction of pT’ to 
the local Lorentz tetrad fields. For on the one hand we expect T to be defined 
on all spacetime, not just on a narrow neighbourhood of P. And on the other 
hand pT’, as a 4-tensor field, cannot be properly defined even on a small region 
of the curved spacetime (¥, g). To see in what sense pT’ can exist, after all, near 
Pew, let us recall that a neighbourhood U, of P is mapped onto a 
neighbourhood of zero in the tangent space W, by the diffeomorphism f 
described on p. 144. If, as we assume, g is a Lorentz metric W, is a 4- 
manifold endowed with the flat Minkowski metric 4. The metric f*4 on U, is 
also flat!” and the 4-tensor field pT’ can certainly be defined on (Up, f* 4) and 
extended—by “dotting the commas”—to an ordinary tensor field pT on this 
flat manifold. But (Up,f* 4) is not a sub-space of (W; g), for f* 4 does not agree 
with the restriction of g to Up. Moreover, if f: Ug —> Wg is defined like 
f: Up>W, ona suitable neighbourhood Up of a point Qe Up, f*y will not 
generally agree with f* 4 on Up m Ug.'* The metrics g, f* 4 and f* 4 determine 
distinct covariant differential operators on Up > Ug. Consequently, “dotting 
the commas” does not mean the same with respect to the three metrics and 
there is no compelling reason why pT should agree on Up - Ug with a tensor 
field gT constructed by the same method on (Ug, f* n). However, on a 
sufficiently small neighbourhood of P (or Q), g very nearly agrees with f* 4 (or 
f*n).'* One may therefore reasonably expect that if {Up,Ug,...} is a 
covering of (W,g) by sufficiently small neighbourhoods of suitable 
worldpoints P,Q, . . .,a collection of geometrical objects pT, gT, . . ., defined 
on each neighbourhood by equivalent physical methods, following the above 
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procedure, will merge almost smoothly into a single global object T on W” 
(Note that in the definition of T one must ignore—or even up—the small 
disagreements between the differential operators of the local flat metrics that 
go into the definition of pT, gT,..., and the homologous differential 
operators of the global metric g.) Such global objects can be constructed from 
all the tensor fields of the locally valid Special Relativity physics. Einstein and 
Fokker’s assertion, in the passage quoted above, that the laws of the original 
Theory of Relativity can be matched by corresponding laws of the new 
generalized theory amounts to the bold but not implausible hypothesis that 
the generally covariant equations which, on each small neighbourhood, link 
the local fields to one another, hold also in the same form between the 
respective global fields. Some such hypothesis is needed to unify the many 
equivalent yet unrelated Special Relativities in a system of the universe 
(cf. p. 137). 

Though the procedure of “dotting the commas” is automatically justified by 
the said hypothesis, it must be administered with caution, for it may 
occasionally turn equations into inequalities when derivatives of the second or 
higher order are involved. The reason why this can happen is that, although 
6? /éx/Ax* is always equal to 0?/dx*dx/, on curved spacetime V,V, generally 
differs from V,V;.'° Consequently, if T is the generalization to curved 
spacetime of the local tensor field pT, obtained in its turn by the standard 
procedure from the local 4-tensor field pT’, it is possible that Th. # Th.» 
even though pT’?, ;, = pT’},,;- (This and other related difficulties are discussed 
by A. Trautman (1965), pp. 123-7.) 

By judiciously substituting covariant for ordinary derivatives wherever the 
latter occur in the definitions, we can extend to a curved spacetime all the 
physical concepts employed in the four-dimensional formulation of Special 
Relativity. The generalizations of Minkowskian 4-tensors thus obtained will 
be called world-tensors (e.g., world-velocity, world-momentum, world-force, 
etc.). Asan illustration that will be useful later, let us figure out the components 
Aj, relative to a chart x, of the world-acceleration of a particle describing a 
timelike worldline, parametrized by proper time t, in an arbitrary relativistic 
spacetime. By (4.3.3) and (4.3.7), the i-th component of the 4-acceleration 
relative to a suitable local Lorentz chart y is d?y'/dt* = (dy'/dt), dy//de. 
“Dotting the comma” and writing U' for dx‘/dt we have therefore that: 


Ai = Ut,U! 
= (Ui; + ry,U)U? 
_ 6U' dx! 
Oxi dt 
d?x! , Ax* dx! 
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Comparing the above with (5.4.7*) we see at once that the particle describes a 
geodesic if and only if its world-acceleration is always equal to zero. 

According to the Einstein—-Grossmann “Outline” the spacetime metric g 
must depend on the distribution of matter (proposition B, p. 139). Special 
Relativity represented the distribution of matter by means of Laue’s energy 
tensor T (pp. 118 ff.). We must now regard this representation as local and 
construct from it a global tensor field, in the manner indicated above. We shall 
denote this global energy tensor by T also. Like its local counterpart, it is a 
symmetric tensor field of order 2. T represents the total energy of bodies and 
fields, except the gravitational field. The generally covariant law that matches 
the conservation principle (4.5.20) can be stated as: 


Ti, =0 (5.5.4a) 


where the T'/ are the components of T (in its contravariant form), relative to an 
arbitrary spacetime chart x and the four functions T'’.; are the components 
relative to the same chart of a vector field known as the covariant divergence of 
T. Einstein describes (5.5.4a) as “the most universal law of theoretical 
physics”.!® It can be spelt out as follows: 


( J-9 Gil!) ;-3 JV -9 Tigi, .=0 (5.5.46) 


where the g; ; are, as usual, the components relative to x of the metric g, and g 
denotes the determinant of their matrix.’ g,T'/ is equal to T,, the 


({)-th component of the mixed form of T. If we set Hi, = 39'"g,;,, and 


tj=Tj J -9,'8 equations (5.5.4) take the simpler form: 
Tj = HT (5.5.5) 


Comparing (5.5.5) with (4.5.14) on page 119, and recalling the argument that 
led to equations (4.5.15), we see at once that (5.5.5) cannot express a 
conservation law unless the right-hand side vanishes. Einstein remarks that it 
will vanish indeed “if no gravitational field is present”, but that otherwise 
equations (5.5.5) do not have “the form of a pure conservation principle”.!° 
This, he adds, is quite understandable from a physical point of view, for in the 
presence of a gravitational field matter cannot fulfil the conservation principles 
all by itself, because it exchanges energy and momentum with the field. The 
interaction is expressed by the right-hand side of (5.5.5). On account of it, 
Einstein regarded for some time the functions H', as “the natural expression 
for the components of the gravitational field”.7° The conservation of energy 
and momentum can only be fulfilled by matter and the field together. True to 
von Laue’s scheme (page 118), Einstein proposed to find a matrix (tj), 
satisfying the equations 

— (3/+t/)=0 (5.5.6) 
Ox? 
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and representing the dynamical state of the gravitational field. (In order to play 
this role, the tj had to be constructed only from the components of the metric 
and their derivatives.) 

The core of a theory of gravity that meets the requirements of the 1913 
“Outline” must be a law binding the metric to the distribution of matter. 
Evidently this law cannot be obtained by generalization of some special- 
relativistic law, assumed valid on local flat spacetimes, but must be found anew. 
Indeed, by saying how the spacetime geometry is determined by the existent 
matter, the law in question will furnish the necessary ground for building a 
cosmos out of local experiences. If the distribution of matter is represented by 
the energy tensor T, the law must be given by differential equations of the form 


G=kT (5.5.7) 


with Ga tensor field of order 2, constructed from the metric g, and Ka factor of 
proportionality, the “gravitational constant” of the theory. Einstein writes 
(5.5.7) componentwise, in terms of a chart x:! 


G4 = KT (5.5.8) 
Equations (5.5.8) must meet the following requirements: 


(i) In the limit of weak fields and low velocities they must tend to agree 
with Poisson’s equation (1.7.4), the well-attested field equation of 
Newtonian theory. 

(ii) G must be symmetric, like T; hence, G'/ = G#", 

(iit) In agreement with (5.5.4a) the covariant divergence of G must vanish 
everywhere; i.e., G'/.; = 0. 

(iv) Since (i) imphes that the equations (5.5.8) must be at least of the 
second order, but it would be premature, in the present state of our 
knowledge of the gravitational field, to countenance equations of a 
higher order, the G‘/ should depend exclusively on the components of 
the metric g and their first and second derivatives with respect to the 
coordinate functions.?? 


Grossmann explicitly says that the only tensor field on a Riemannian n- 
manifold whose components depend exclusively on the components of the 
metric and their first and second derivatives is the Riemann tensor R.?°> He 
surely knew that there is essentially only one way of deriving from it a tensor of 
order 2, namely, by contracting R with respect to the first and the last indices 
(p. 349, note 15) to form the Ricci tensor,?* given componentwise by 


R,; = Rhy (5.5.9) 


This implies that every tensor field on the spacetime (W, g) that meets 
requirement (iv) must be an ¥ (W)-linear combination of the Ricci tensor and 
the metric g, whose coefficients are constants, or scalar fields constructed from 
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R and g, One such combination, which I shall call the proper Einstein tensor, 
stands on the left-hand side of the field equations of General Relativity, when 
they are written after the scheme (5.5.8): 


RU —1Rg, gi) = —KTH (5.5.10) 


(where the R'/ are the components of the Ricci tensor in its contravariant form 
and we have assumed that «x > 0). Under a suitable physical interpretation, 
equations (5.5.10) fulfil condition (i).2> They also satisfy condition (ii), for the 
Einstein tensor is obviously symmetric. It can be shown, moreover, that they 
always meet requirement (iii), independently of the particular values of the 
metric components, for the covariant divergence of the Einstein tensor 
vanishes identically.2° Equations (5.5.10) are only a particular member of a 
one-parameter family of equation systems that satisfy conditions (i)}-(iv). The 
typical member of the family is: 


Ri — (GR gy, —Agii = — KT! (5.5.11) 


Equations (5.5.11) are the modified field equations of General Relativity 
proposed by Einstein in 1917 (and withdrawn in 1931).?” I shall therefore refer 
to the typical tensor field represented on their left-hand side as the modified 
Einstein tensor. David Lovelock (1971, 1972) has proved that, in an arbitrary 
Riemannian 4-manifold, the modified Einstein tensor is the only tensor field 
constructed in accordance with condition (iv) whose covariant divergence 
vanishes identically. Since it automatically fulfils condition (ii}—for both the 
metric and the Ricci tensor are symmetric—it turns out that the modified 
Einstein tensor constitutes a natural solution for G in equations (5.5.7) and 
that the field equations (5.5.11) furnish a mathematically determinate answer 
to the problem posed by the “Outline” of 1913. (The choice of a particular 
value for A, like the assignment of a definite sign to the factor x, must rest of 
course on physical grounds.) Had Lovelock’s theorem been discovered sixty 
years earlier, there can hardly be any doubt that Einstein would have filled in 
the left-hand side of equations (5.5.8) with the components of the Einstein 
tensor.7® He would then have established the field equations of General 
Relativity by a purely mathematical argument as soon as he designed a clear 
blueprint for the theory of gravity. But in actual fact the Einstein tensor was 
first constructed by Einstein himself—who finally found equations (5.5.10) 
guided by physical considerations—and it is unlikely that, without his 
prompting, the mathematicians would have paid any attention to such a 
special object. 

Although the Einstein tensor was unknown in 1913 it is somehow surprising 
that Einstein did not produce it at once, for it is a fairly simple combination of 
the metric and the Ricci tensor, and the inexorable vanishing of its covariant 
divergence is surely striking. It should be noted, however, that the fundamental 
law expressed by equations (5.5.4a) apparently did not play an important 
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heuristic role in Einstein’s search for the precise form of G in (5.5.7}—except 
indirectly, through its conjectural, non-covariant corollary (5.5.6). At any rate, 
in the papers of 1913-15 Einstein never lists condition (ii) on p. 000 among the 
requirements that G must meet. He may have omitted it because it was far too 
obvious; but also, perhaps, because he expected it to be fulfilled as a matter of 
fact, and not as a matter of mathematical necessity—in which case, it would 
impose a constraint on the metric g, but not on the form of G. But the main 
obstacle to the early discovery of the Einstein tensor was doubtless 
Grossmann’s verdict that if the left-hand side of equations (5.5.8) were built 
from the Ricci tensor they could not possibly agree in the Newtonian limit with 
Poisson’s equation. From this somewhat rash judgment, Grossmann rightly 
inferred that it would not be possible to find suitable generally covariant 
second order equations of the form (5.5.8), and Einstein accepted his expert 
opinion.?° 

Einstein noted that “since the definitive, exact gravitational equations might 
well be of an order higher than the second [ . . . ], there remains the possibility 
that the perfect exact differential equations of gravitation could be covariant 
with respect to arbitrary coordinate transformations”. But while the utility of 
speculating about such higher order equations remained severely limited by 
our inability to test them and perhaps even to integrate them, the theory of 
gravity would have to rest on a not generally covariant system of second order 
equations. Einstein postulated that they should be covariant at least under 
arbitrary linear transformations. After observing that, if the components of a 
(q, 0)-tensor field are denoted by T'--:%, the functions (gT'°-°%,), 
behave like tensor field components under linear transformations of the 
coordinates, he claimed that the operator that maps the former quantities to 
the latter is a generalization of the Laplacian. Hence—he concluded—if 
equations (5.5.8) are to agree in the Newtonian limit with Poisson’s equation, 
the G'/ must be equal, on a first approximation, to (g"g'/,),. Substituting 
« 1G" for T' in (5.5.4b) we see that the second term, that is, 


1 — ae. eit - see m 
ie J-9 Giz n4G? ® aK JV-9 Gije GO s)r (5.5.11) 


must be equal to a sum of partial derivatives. From these heuristic assump- 
tions, and neglecting the higher order terms omitted on the right-hand side of 
(5.5.11), Einstein derives, after some lengthy but straightforward analytic 
manipulation, the following system of field equations: 


(J -9 Ging" 9" 5), = K(T4, +t) (5.5.12a) 


where the gravitational “energy matrix” (t/) is given by: 


ey ee ee 
tL = eve. (6/9 Hi, Hi, — g? Hp, His) (5.5.12b) 
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in striking analogy with the energy tensor of the electromagnetic field 
(4.5.12).2° The theory of gravity based on equations (5.5.12) is known as the 
Einstein—-Grossmann theory. It is generally acknowledged that all its basic ideas 
originated with Einstein. 


5.6 Einstein’s Arguments against General Covariance: 1913-14! 


The method described on pp. 155 ff. enabled Einstein, in principle, to 
describe the influence of the gravitational field on any known physical process 
by means of generally covariant equations. In the Vienna lecture of September 
1913 he remarked that 

the whole problem of gravitation would therefore be solved if one succeeded in finding a set 


of equations, covariant under arbitrary coordinate transformations, which is satisfied by the 
quantities g;, that determine the gravitational field.” 


Einstein and Grossmann had not found such a set, and the field equations 
(5.5.12) proposed by them were not generally covariant. In the main text of the 
“Outline” Einstein attributed their failure to the fact that they had—for good 
reasons—restricted their search to second-order equations.’ But obviously he 
was not happy about this state of affairs. In a letter to H. A. Lorentz of August 
14th, 1913, in reply to the latter’s “warm” reception of the “Outline”, Einstein 
confessed that his own “confidence in the admissibility of the theory was still 
shaky”. The action of the gravitational field on other physical processes was 
satisfactorily dealt with, for it had been subjected, by means of the absolute 
differential calculus, to generally covariant laws. Thereby the gravitational 
field g,; showed itself “so to speak as the framework that supports everything 
[das Gerippe an dem alles hdngt |”. 
But the gravitational equations themselves unfortunately do not have the property of general 
covariance. Only their covariance with respect to linear transformations is assured. All our 
confidence in the theory rests however on the belief that acceleration of the reference frame 
is equivalent to a gravitational field. Hence, unless all the equation systems of the theory, 


including equations (18), admit other transformations beside the linear ones, the theory 
rebuts its own starting point and does not stand on solid ground.* 


Two days later, however, on August 16th, Einstein wrote to Lorentz: 


To my great joy I found yesterday that the reservations regarding the theory of gravity stated 
in my last letter and in the paper itself are not warranted. The question can apparently be 
solved as follows: The expression of the energy principle for matter and the gravitational 
field together is an equation of the form (19), ie., of the form EeT,,/¢cx’ = 0. Starting from 
this assumption I set up equations (19).* Examination of the general differential operators of 
the absolute differential calculus shows however that an equation of this structure can never 
be absolutely covariant.® 


By postulating the said equations we therefore implicitly specialize the 
available selection of reference systems: we restrict ourselves to those in which 


the principle of energy and momentum conservation holds good “in this 
form”. 
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Summing up: by postulating the conservation principle we are led to a highly determined 
choice of the reference system and of the admissible transformations.’ 


This is precisely the explanation given by Einstein in the Vienna lecture of 
September 1913 for his and Grossmann’s inability to provide generally 
covariant gravitational field equations.® However, in a footnote to the printed 
version of the lecture—published on December 1 Sth, 1913—he announced the 
discovery of a much more cogent explanation.’ This must have been found 
before November 2nd, 1913, for on that day Einstein wrote to Ludwig Hopf: 


I am now quite happy [sehr zufrieden] with the gravitation theory. The fact that the 
gravitational equations are not generally covariant, which sometime ago still worried me 
inordinately [welche mich vor einiger Zeit noch ungemein stérte] has turned out to be 
inevitable. It is easily proved that a theory with generally covariant equations cannot exist if 
we demand that the field be mathematically completely determined by matter.'° 
The newly found proof was presented in one of the notes appended to the 
“Outline” when it finally came out in the Zeitschrift fur Mathematik und 
Physik.'! The following paraphrase uses our terminology and notation. 
Let the spacetime (¥, g) include an open subset H on which the energy 
tensor T vanishes. (H is a spacetime region in which no “material process” 
occurs, i.e. a hole in the universe.) Let the closure of H be included in the 
domain of a chart z. Let g‘/ and T " denote the (i, j)-th components relative to z 
of the contravariant forms of g and T, respectively. We assume that the T", 
given outside H, completely determine the g‘/ on the entire domain of z. Let y 
be another chart defined on the same domain as z, and such that (i) outside H, 
y = z, but (ii) on H, y # zand the matrix of the components g'‘/ = g(dy’, dy’) 
is different from the matrix of the g/ = g(dz', dz’). Evidently, y and the g’” can 
be chosen in several ways. Let T’ denote the (i, j)-th component of T relative 
to y. Since T = 0 on H and z = y outside H, T’/ = T4 everywhere, on the 
common domain of z and y. Consequently, in the particular case under 
consideration, a given matrix (T") of components of the energy tensor 
corresponds to several distinct matrices (g'/), (g’‘4), . . . of components of the 
metric tensor. Contrary to our assumption, the T do not determine the g‘. 
The argument, as it stands, is impeccable, and certainly proves that a 
particular array of metric tensor components g" cannot be the sole solution of 
a generally covariant system of differential equations of the form 


Gi = Ti (5.5.8) 


—where the G may depend on the g" and their derivatives to an arbitrarily 
high order—even if the values of the T/ are specified everywhere. However, an 
objection will immediately suggest itself, if we have learned to regard the 
metric g and the energy tensor T as coordinate-free geometric objects. From 
this point of view, they may be considered as mappings which assign a matrix 
of scalar fields (components) to each spacetime chart—or to the corresponding 
tetrad field.!? Einstein’s argument shows that gcan take different values at two 
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charts z and y at which T takes the same value. This does not entail, however, 
that g is not uniquely determined by T. For a mapping can be determined by 
another even if its value varies on parts of its domain on which the latter is 
constant.'3 

I have no evidence that in 1913--14 anyone raised the foregoing objection.'* 
Nevertheless, in two papers published in 1914 Einstein modified his argument 
in a way that seems designed to forestall it. Instead of a coordinate 
transformation that changes the components of the metric while preserving 
those of the energy tensor, he now considers a point transformation, by which 
all tensor fields are naturally “dragged along”, but which is so contrived that 
the metric is mapped on a different metric, while the energy tensor is mapped 
on itself. In this form the argument is no longer liable to our earlier stricture. 
Einstein’s clearest formulation of the modified argument is to be found in the 
second paper.!> The two charts that agree outside the hole H but disagree 
inside it are here called K and K’. The matrices of the components of the 
covariant metric tensor relative to K and K’ are denoted by (g,;) and g’;;); their 
typical values, by G(x) and G'‘(x’), where x and x’ presumably stand for the real 
number quadruples assigned, respectively, by K and K’ to an arbitrary 
worldpoint P. Einstein grants that G(x) and G'(x’) are merely two different 
representations of the same gravitational field, referred to two coordinate 
systems. However, if we substitute “the coordinates x, for x’, in the functions 
gi; to form G’(x), we obtain a description, relative to K, of a different 
gravitational field, which does not agree with the one originally given, relative 
to the same coordinate system, by G(x). Einstein proceeds: 


If we now assume that the differential equations of the gravitational field are generally 
covariant, they are satisfied by G’(x’) (relative to K’) if they are satisfied by G(x) relative to K. 
Hence they are also satisfied, relative to K, by G'(x). There are therefore two distinct 
solutions, G(x) and G’(x), relative to K, although both solutions agree on the boundary [of 
the hole H]. In other words, the course of events in the gravitational field cannot be fixed 
univocally by generally covariant differential equations.'® 


The meaning of this text will become clearer if we reintroduce our notation. 
Let zand y denote Einstein’s coordinate systems K and K’. G(x) stands then for 
the value of the matrix of the components g,; = g(@/6z', 0/dz’) at an arbitrary 
point P mapped by z on the number quadruple z(P) (= x, in Einstein’s 
notation); while G’(x’) denotes the value of the matrix of the components 
9’; = @(0/dy', 6/dy’) at the same point P, which is mapped by y on y(P) (= x’, 
in Einstein’s notation). How are we to understand Einstein’s prompting to 
“substitute the coordinates x, for x’, in the functions g’;;’, so as to form G’(x)?I 
can think of two alternative interpretations. Either (i) we are asked hereby to 
evaluate the matrix (g’,;) = (g(@/dy’, 0/dy’)) at the point mapped by y on 
x =2(P), ie. at the point y~!-z(P); or (ii), less naturally but more probably, we 
are required to substitute not only the unprimed for the primed coordinate 
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values (i.e., 2(P) for y(P)), but also the unprimed for the primed coordinate 
functions (i.e., z* for y*), and hence to evaluate the matrix (g(@/0z', 6/éz*)) at 
y”}+2(P).!7 In either alternative, Einstein’s move shifts our attention from the 
coordinate transformation y:z~!, which could only change the chart- 
dependent representation of the given metric, to the point transformation 
y~4+z, which evidently can induce a change of the metric itself. Set f = y~!-z. 
fis a diffeomorphic permutation of an open submanifold U < W, which 
contains the closure of H.'® The conditions prescribed for y and z imply that f 
is the identity on U — H, but is arbitrary on the interior of H. Hence on H the 
originally given metric g will generally differ from its “pull-back” f*g.1° It can 
be readily shown than Einstein’s G’(x), in our second, preferred interpretation, 
is equal to the value at P of the matrix of the components of f *g relative to z.?° 
Does Einstein’s argument prove that, under its assumptions, f *g will satisfy 
any set of equations of the form (5.5.8) which is satisfied by g? To show that it 
must indeed be so let us first define the “drag-along” h, w of a covariant tensor 
field w on an n-manifold M, by an arbitrary diffeomorphism h: M > M'. We 
set h,w equal to (h~')*«, the “pull-back” of w by the inverse diffeomorphism 
h~'. h,@ is therefore a covariant tensor field on M’, of the same order as w.”! 
Returning now to the metric g and the energy tensor T on U, and the 
diffeomorphism f of U onto itself, we see at once that, while f,T = T on all U, 
f,8 generally differs from g in the interior of the hole H. According to our 
assumptions, U is mapped by the charts z and y onto the same region of R* (see 
note 18). We denote this region by V. We have also assumed that, if P lies in H, 
z(P) generally differs from y(P). On the other hand, since f = y~'-z, y assigns 
to f(P) the same real number quadruple which z assigns to P. Hence the 
spacetime region U is portrayed by z in V in exactly the same way as tts 
deformed image f(U) is portrayed in V by y. In particular, the “drag-along” z,g 
of the metric g by z is exactly the same tensor field on V as the “drag-along” 
Vata of the metric f,g by y.2* The same holds, of course, mutatis mutandis, for 
the energy tensor T. Consequently, any system of differential equations that 
ties z,g toz, T and yields gas the metric on U determined by T will, by the same 
token, tie y,f,g to y,f, T and yield f,g as the metric on U determined by f,T. 
Since f, T = T, it turns out that a given energy tensor is compatible with more 
than one metric. 

As we shall see in the next section, in November 1915 Einstein proposed with 
unconcealed joy a generally covariant system of field equations, through 
which, he believed, the spacetime metric responsible for the gravitational field 
was uniquely determined by the energy tensor. How did he dispose of his 
mathematical proof that no such thing was possible? His paper of November 
18th, 1915, on the perihelion advance of Mercury, contains a paragraph that 
expresses his new view on the matter. After setting up a system of gravitational 
field equations in the absence of matter (T = 0), covariant with respect to 
coordinate transformations with Jacobian determinant equal to 1, he remarks 
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that the metric in the vacuum surrounding the sun might not be uniquely 
determined by these equations and the given solar mass. 


This follows from the fact that these equations are covariant with respect to arbitrary 
transformations with determinant 1. However, it might be justified to assume that all the 
solutions can be reduced to one another by means of such transformations, so that (for a 
given set of boundary conditions) they differ from each other only formally, but not 
physically.?3 


Einstein’s newly acquired perception that the alternative solutions of a system 
of covariant equations may in fact be indistinguishable from a physical point 
of view motivates his rejection of the “hole” argument against general 
covariance. He reexamined this argument in two letters sent, respectively, to 
Paul Ehrenfest on December 26th, 1915, and to Michele Besso on January 3rd, 
1916. The letter to Ehrenfest explicitly refers to the section of Einstein (19145) 
that we discussed above: 


Everything is correct in Section 12 of my paper of last year (the first three paragraphs) up to 
the italicized part at the end of the third paragraph [quoted above, on p. 164—R.T.]. 
Absolutely no contradiction with the uniqueness of events follows from the fact that both 
systems G(x)and G’(x), referred to the same system of reference, satisfy the conditions for the 
gravitational field. The apparent cogency of this argument disappears at once if one 
considers that 1) the system of reference signifies nothing real, 2) that the (simultaneous) 
realisation of two different g-systems (better said: of two different gravitational fields) in the 
same region of the continuum is impossible by the nature of the theory. 

The following consideration should take the place of Section 12. The physically real in 
what happens in the world (as opposed to what depends on the choice of the system of 
reference) consists of spatiotemporal coincidences. For example, the point of intersection of 
two worldlines, or the assertion that they do not intersect, is real. Such assertions referring to 
the physically real are not lost through any (single-valued) coordinate transformation. If two 
systems of g,; (or generally of any variables used for the description of the world) are so 
constituted that the second can be obtained from the first by a pure space-time 
transformation, then they are fully equivalent. For they have all spatiotemporal point 
coincidences in common, that is, everything observable. 

This argument shows at the same time how natural is the demand for general 
covariance.?* 


The letter to Besso repeats the same ideas more concisely and perhaps more 
elegantly.*° 

To understand what Einstein had in mind when he wrote these letters let us 
again consider the spacetime region U, containing a hole H. The gravitational 
field in H is probed, in principle, by means of test particles describing timelike 
geodesics of the Riemannian 4-manifold (U, g). Denote one such timelike 
geodesic by y. The gravitational field determined by gat each worldpoint Peg 
is disclosed by y’s behaviour vis-a-vis nearby timelike geodesics on a small 
neighbourhood of P.?° In order to perform its probing mission, the particle 
along y must be somehow linked to instruments placed in the matter-filled 
region outside H. Thus, for example, the exact location of a worldpoint 
Pey c Hcan beascertained by means of three bouncing signals, emitted at Q,, 
Q,, Q3 from suitable devices in U — H, and recovered at Q}, 04, Q3. Let o, 
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denote the worldline of the signal from Q, through P to Q; (r = 1,2,3). 
Let f be, as before, a diffeomorphism of U onto itself, that agrees with the 
identity on U —H. f maps the Riemannian manifold (U, g) isometrically onto 
(U,f,g).?’ f-y is therefore a timelike geodesic of (U,f,g). The gravitational 
field determined by the metric f,g at f(P)ef-y is disclosed by f- y’s behaviour 
vis-a-vis nearby geodesics—of (U, f,g}—on a small neighbourhood of f(P). 
The location of f(P) can be ascertained by three bouncing signals, of the same 
kind used for locating P, emitted at f(Q,) = Q, in U — H and recovered at f(Q’) 
= Q,. The worldline of each such signal in (U,f,g) can be no other than f-¢,, 
which is identical with o, in U — H and hence in the vicinity of both Q, and Q'. 
The length of c, from Q, through P to Q} in (U, g) is of course the same as the 
length of f- 0, from Q, through f(P) to Q; in (U, f,g). There is evidently no way 
of distinguishing between an event P on the worldline y of a test particle 
traversing the hole H in (U, g),and the matching event {(P) on the worldlinef: y 
in (U, f,g). The 4-manifold U, endowed with the metric g and the energy tensor 
T, and the same manifold U, endowed with f,g # g and with f,T = T, do 
represent but one physical situation. 

John Stachel, who—as far as I know—was the first to understand the 
significance of Einstein’s reappraisal of the “hole” argument in the letters to 
Ehrenfest and Besso,”® conjectures that Einstein’s self-criticism in those letters 
rests on the tacit assumption that the points of spacetime where no matter is 
present cannot be physically distinguished except by the properties and 
relations induced by the spacetime metric. This assumption would certainly 
account for Einstein’s remark (in the letter to Ehrenfest) that the simultaneous 
realisation of two different metrics in the same region of spacetime “is 
impossible by the nature of the theory”, and for his even stronger statement (in 
the letter to Besso) that to attribute two different metrics to the same spacetime 
manifold is nonsense. The idea that spacetime points are what they are only by 
virtue of the metric structure to which they belong agrees well with the thesis, 
common to Leibniz and Newton, that “it is only by their mutual order and 
position that the parts of time and space are understood to be the very same 
which in truth they are”, for “they do not possess any principle of individuation 
apart from this order and these positions.”*? The assumption that Stachel 
conjecturally attributes to Einstein has some very important consequences. In 
the light of it, it is obviously meaningless to speak in General Relativity of a 
spacetime point at which the metric is not defined. (This view is anyhow 
prevalent in the literature on the so-called singularities of relativistic 
spacetimes—see Section 6.4) It also makes havoc of the controversial 
philosophical doctrine (examined in Section 7.2) according to which the metric 
of a relativistic spacetime is not a matter of fact, but of mere convention. 
Finally, if conjoined with a realist conception of spacetime as the aggregate of 
its points, it entails that those points are subject to the philosophically alluring 
but notoriously difficult principle of individuation by form.°® Such a 
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conclusion is clearly at variance with the fashionable semantic theory that 
conceives of proper names as “rigid designators”, denoting the same individual 
in many alternative diversely structured “possible worlds”.*! Proper names 
cannot function in this way if the very individuals which are their referents owe 
their identity to the structure in which they are enmeshed. 

Although I am not addicted to this brand of semantics, I do not believe that 
it clashes with the existence of generally covariant field equations of gravity, as 
the foregoing considerations apparently imply. I shall therefore propose a 
different way of dealing with Einstein’s “hole” argument against general 
covariance, which does not assume that spacetime points can only be 
physically distinguished by means of their metric properties and relations. I 
must emphasize, however, that I do not claim that Einstein ever countenanced 
what I am about to say, and that I do not dispute the historical accuracy of 
Stachel’s reading of the letters to Ehrenfest and Besso. I submit that two 
physical objects or states of affairs can be distinguished from each other either 
(i) empirically, in the blunt, not further analyzable way in which two mint-fresh 
dimes or two blank 3 x 5 cards are regarded as unquestionably different, even 
if they look alike in every respect; or (ii) rationally, if, on any grounds, they are 
equated to or represented by structurally unequal conceptual systems. In the 
case in point, the physical situations represented respectively by the manifold 
U with metric g and energy tensor T, and by the same manifold U with metric 
f,8 and energy tensor f,T, are, as we have just seen, observationally 
indistinguishable. Since the structures (U,g,T)> and <U,f,g¢,f,T> are 
isomorphic, the situations they stand for are, moreover, conceptually 
indistinguishable—at any rate in terms of this representation. Hence, the said 
situations are, as far as our assumptions go, perfectly indiscernible, and 
therefore must be regarded as identical.*? In the view I have just put forward, 
the onus of individuating the points of spacetime does not rest with the metric, 
which is a structural feature of the world. The role of structure is not to 
individuate, but to specify; and of course it cannot perform this role beyond 
what its own specific identity will permit, that is, “up to isomorphism”. It is 
only on non-conceptual, extra-structural grounds that two isomorphic 
structures can be held to represent two really different things.>* 


5.7 Einstein’s Papers of November 1915 


In March 1914 Einstein wrote to Besso about a “novelty” concerning 
gravity. Let the Einstein—-Grossmann equations (5.5.12) be referred to a chart 
x. If we differentiate the left-hand side of (5.5.12a) with respect to x/and sum 
over j we obtain four functions: 


B= (/— 9 Gin9" 9" s)arj (5.7.1) 
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Equations (5.5.12a) and (5.5.6) entail that 
B, =90 (5.7.2) 


Since the B, are not the components, relative to x, of a vector field, equations 
(5.7.2) are not generally covariant and must be regarded as a set of conditions 
restricting the choice of the chart x. The Einstein-Grossmann equations 
evidently cannot hold good if referred to a chart which does not satisfy the 
conditions (5.7.2). The good news that Einstein reported to Besso was that he 
had succeeded in proving “by a simple calculation, that the gravitational 
equations [5.5.12] are valid for every reference system which fulfills those 
conditions”.' Einstein took this to mean that the covariance group of the 
Einstein-Grossmann equations was as large as one could wish.” The proof was 
given by Einstein and Grossmann (1914), and again, at greater length, by 
Einstein (19145) alone. We shall not go into it. Its main interest lies in the fact 
that in the course of it the Einstein-Grossmann equations are derived from a 
variational principle. Credit for this methodological innovation is given to 
Paul Bernays.* Unfortunately, the derivation is not rigorous. But it did set an 
important historical precedent. 

After the meeting of the Prussian Academy of October 29th, 1914, at which 
Einstein submitted his paper (19146), he did not publish any further 
contributions to the theory of gravity for a whole year—though he did not 
remain entirely silent on the subject, which he discussed, for instance, at 
G6ttingen in the Summer of 1915, being “extremely pleased that everything 
was understood in detail”.* Then, on the four Thursdays of November 1915, 
he presented to the Prussian Academy four papers—“On the General Theory 
of Relativity”, “On the General Theory of Relativity (Addendum)”, 
“Explanation of the Perihelion Motion of Mercury by the General Theory of 
Relativity” and “The Field Equations of Gravitation”—in which he discarded 
the Einstein-Grossmann equations and proposed instead, in quick succession, 
three alternative systems of gravitational field equations, the last two of which 
were generally covariant.* Let us take a look at this wondrous development. 

In a letter to Arnold Sommerfeld of November 28th, 1915, after declaring 
that he had just gone through “one of the most exciting, most strenuous times 
of my life, also one of the most rewarding”, Einstein explains why he had to 
admit that his “former gravitational field equations [i.e. eqns. (5.5.12)] were 
completely unfounded”. The equations yield less than half the observed value 
of Mercury’s secular perihelion advance (18”, instead of 45”) and, in violation 
of the Equivalence Principle, they are not satisfied by the inertial force field on 
a uniformly rotating system. Moreover, the covariance requirement of the 
1914 papers does not determine the Lagrangian of the variational principle 
from which the field equations were derived, but allows the choice of an 
arbitrary Lagrangian.® 

In the first paper of the series Einstein had said that the indeterminacy of the 
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Lagrangian had caused him to lose his confidence in his earlier field equations 
and to look for “a way of restricting the possibilities in a natural fashion”. He 
was thus led back to require a “more general [allgemeinere ] covariance of the 
field equations”, a requirement which three years before he had given up “with 
a heavy heart”.”? He added: 


Just as the Special Theory of Relativity is based on the postulate that its equations must be 
covariant under linear orthogonal transformations, the theory to be presented here rests on 
the postulate of the covariance of all equation systems under transformations whose Jacobian 
determinant is equal to 1.8 


Let us say that an atlas of an n-manifold is a J-atlas if the coordinate 
transformations between any two of its charts has a Jacobian determinant 
equal to 1. Einstein’s new theory tacitly presupposes that the maximal atlas of 
the real spacetime manifold (W, g) contains at least one J-subatlas.? Let A, 
denote a maximal such subatlas, i.e. one that is not contained in another J- 
subatlas. The assignment to each chart of A, of an ordered set of 47*” smooth 
functions on its respective domain will be said to define a J-tensor field of type 
(q, r) on spacetime, if such ordered sets of functions are mutually related like 
the components of an ordinary (q, r)-tensor field relative to the same charts (i.e. 
if they “transform” like the latter under the coordinate transformations of A,). 
The members of each such ordered set of functions are then the components 
of the J-tensor field relative to the pertinent chart. Thus, for instance, the 
determinant g of the metric components g;; relative to any given chart of A, 
defines a J-tensor field of type (0, 0) or J-scalar field, for it is invariant under 
coordinate transformations whose Jacobian equals 1 (see p. 317, note 18). 
Evidently, if the T# in equations (5.5.1) are the components, relative to a chart 
xéA,, ofa J-tensor field T, of type (q, r), the T§., given by those equations are 
the components of a J-tensor field of type (gq, r+ 1), which we may call the J- 
covariant derivative of T. 

The rationale for Einstein’s new theory and its curious covariance postulate 
will flow from the following considerations. Einstein recalls that “all 
covariants formed exclusively from the g,,; and their derivatives” can be 
obtained from the Riemann tensor R. The problem of gravity directs our 
attention to tensors of order 2, constructed from R and the metric g. The 
symmetries of R imply that “such a construction can be effected in only one 
way”, yielding the Ricci tensor.!° The matrix of the Ricci tensor’s components 
relative to an arbitrary chart x can be regarded as the sum of two matrices:'' 

R,; = U,;+5S); (5.7.3a) 
with 

U,,= —Tiat PhO (5.7.35) 
and 

$)= igo Peel 


ij ik, j 


= (log —g).;,;-P™log J-g).m (5.7.3) 
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Suppose that x belongs to the J-atlas A;. Then, log ./ —g defines a J-scalar 
field. The components, relative to x, of its first J-covariant derivative are then 


its four ordinary derivatives with respect to the x/, (log J/-9)}: The 
components, relative to x, of its second J-covariant derivative are therefore the 
S; ;. Hence, the matrices (S; ;) and (R; ;) are arrays of J-tensor field components, 
relative to x, and their difference, the matrix (U; ,), is likewise such an array. Let 
T;; denote the components of the covariant energy tensor relative to the 
arbitrary chart x € A,. The field equations of Einstein’s new theory of gravity 
are: 

U.,= 


ij 


ay ee 


ij 


(5.7.4) 


As a system of equations between matching J-tensor field components, (5.7.4) 
is obviously covariant under coordinate transformations whose Jacobian is 
equal to 1. Einstein motivates his choice of this system of field equations by 
deriving them from the variational principle 


ac —Kg"*T,,)d*x = 0 (5.7.5a) 
where the variation affects the metric g but not the energy tensor T, and 
L=g"VijUi; (5.7.5b) 
The variational equation (5.7.5a) will hold only if 
0 [ é6L OL 
ax* (ar) gil K tj ( ) 


Noting that @L/dg? = —Ik 7, and that 0L/dg),, = —T¥,,’? (5.7.6) can be 
rewritten as: 


—Th,. th rn = = —KT;; 


(5.7.6*) 


which is of course the same as (5.7.4). The meaning of Einstein’s variational 
argument is obscured, however, by the fact that L is not a J-scalar field. 

Einstern shows next that the new theory satisfies the “conservation 
principle” 


(Ti+t{), =0 (5.7.7) 
if the energy matrix of the gravitational field, (tj), is suitably redefined. He sets 


xt] = $0492.05, 9° TRG, (5.7.8) 


an expression reminiscent of the earlier definition (5.5.12b).13 Multi- 
plying both sides of (5.7.8) by 5§ and summing over k we verify that 
xt) = g*I,13, = L. The Lagrangian in (5.7.5a) is therefore equal to 
k(t} — T4). 
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Multiplying both sides of (5.7.6*) by gi* and summing over i, we obtain:'* 
- Ca —9? TT 3, = - xT? (5.7.9) 
If Kth is subtracted from each side, (5.7.9) becomes: 
—(g"T) , —ZKtfo? = —K (Th +t") (5.7.10) 
In view of (5.7.7), equations (5.7.10) imply that:!° 
(g'"4 - Ktk) =0 (5.7.11) 


In other words, (g'*,, —xtk) = const. Einstein sets this constant equal to 0. 
Let us now multiply both sides of the field equations (5.7.6*) by g'/ and sum 
over i and j. We obtain:'® 


(gi (log / —@),;). +944; Kt) = —xT! (5.7.12) 
Since (g/J;; —xt/) has been set equal to 0, (5.7.12) amounts to 
(g'i (log V=-9)pi = —KTi (5.7.13) 


Equation (5.7.13) implies that if g is constant, the invariant trace T/ of the 
energy tensor vanishes. This is hard to accept. For instance, the energy tensor 
of a continuous distribution of incoherent matter (pressureless dust) is given 


by " ie 
Ti = puU'U (5.7.14) 


where p is the proper energy density and the U! are the components of the 
world-velocity field (pp. 119, 120, 157). Since g, ;U'U/ = 1, the trace T/ is equal 
to p, and can only vanish ina perfect vacuum. Einstein concludes therefore that 
“it is impossible to choose the coordinate system so that \/—g equals 1”."’ 
Einstein must have felt uneasy about this highly artificial restriction for a 
week after submitting his (1915a) he came up with a physical hypothesis 
designed to make the restriction unnecessary. He conjectures that the trace of 
the energy tensor might in fact vanish everywhere if all the gravitational energy 
hidden in every chunk of matter is deducted. The electromagnetic energy 
tensor of classical electrodynamics does have a vanishing trace,’® and Einstein 
seeks to justify his bold conjecture by recalling that macroscopic matter is not 
something “primitively given, physically simple”, and that quite a few 
physicists “hope to reduce matter to purely electromagnetic processes”. 
Although such processes would doubtless obey the laws of a new, non- 
Maxwellian electrodynamics, we may just as well assume that “in such a 
perfected electrodynamics the trace of the energy tensor would also vanish”.'° 
This hypothesis can be reconciled with the fact that the energy tensor of dust 
(5.7.14) does not have a vanishing trace, because 
gravitational fields are liable to be an essential ingredient of the “matter” to which the above 


expression refers. Thus T/ can be apparently positive for the entire system, while in reality 
only T/+t/ is positive, while T/ vanishes everywhere.?° 
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By assuming that “in fact the condition T/ = 0 is universally fulfilled”, we 
are at last in a position to formulate the gravitational field equations in 
generally covariant form. Einstein writes them as follows: 


R,, = —KxT 


ty 


(5.7.15) 


ij 
where the R; ; are the components of the Ricci tensor, “the only tensor available 


for setting up generally covariant equations of gravity”.?' The theory based on 
equations (5.7.15) places no restrictions on the choice of spacetime charts. In 


particular, if we choose a chart x relative to which J —g = 1,the S; ; defined as 
in (5.7.3c) in terms of x will all vanish. Therefore, the new field equations 
(5.7.15), when referred to such a chart, are equivalent to the field equations 
(5.7.4) of Einstein (1915a), and all the consequences derived in that paper, 
including the validity relative to a suitable reference system of the “conserv- 
ation principle” (5.5.6), still hold good in the theory of Einstein (19155). 

In the absence of matter, T;; = 0, and equations (5.7.15) amount to 


R;; = 0 (5.7.16) 


Einstein’s third paper of November 1915 gives an approximate solution of 
(5.7.16) for a spherically symmetric static metric that becomes flat at infinity.?? 
This is a reasonable model of the gravitational field surrounding the sun. After 
noting that his results agree with his earlier prediction of the gravitational 
redshift of signals coming from the sun to the earth, Einstein uses them for 
calculating (i) the deviation that will be experienced by a light-ray grazing the 
surface of the sun, and (ii) the secular advance of Mercury’s perihelion in the 
absence of perturbing factors. The value of (i) turns out to be 1.75”, or twice 
the amount predicted by Einstein in 1911. The new figure was confirmed, to 3 
parts in 10, by the British Equatorial Expedition of 1919.7* For (ii), Einstein 
obtains 43”, which, he says, is “in complete agreement” with the anomaly of 
45” + 5” per century, unexplained by the Newtonian theory.”* 

The theory of gravity of Einstein (19155) rests on a groundless hypothesis 
about the fine structure of matter. It is as if Einstein’s usual wariness of 
microphysical speculation had temporarily subsided in the excitement over the 
discovery of genuine generally covariant field equations. However, in spite of 
the theory’s admirable success in explaining the anomaly of Mercury, Einstein 
did not rest content with it, but very soon proposed still another system of 
generally covariant gravitational field equations, namely: 


Rj = —K(T,;-29;;T?) (5.7.17) 
Equations (5.7.17) are known as the Einstein field equations or the field 
equations of General Relativity. In the absence of matter they are obviously 
equivalent to (5.7.16), so that, as Einstein is quick to point out, the predictions 
and post-dictions of his (1915c) are also entailed by them. They are equivalent, 
too, to the system (5.5.10), on page 160, which equates the energy tensor to the 
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proper Einstein tensor.?° Equations (5.7.17) therefore imply, by the sheer force 
of algebra, that 


TO (5.7.18) 


However, Einstein does not mention this remarkable fact in his (1915d). On 
the other hand, he devotes almost one third of this brief paper to showing that 
his latest theory satisfies, no less than its predecessors, the non-covariant 
conservation law 

(Ti+t}) ,=0 (5.7.19) 


if “the ‘energy tensor’ of the gravitational field”,?® tj, is given by equations 
(5.7.8), as in his (1915a). Since the t/ are not tensor components, any argument 
involving them must be referred to a definite chart. As in his (19155), Einstein 
chooses a chart x, relative to which the determinant g of the metric 
components is equal to — 1. He uses the same special chart to explain why he 
added the new term (1/2)g;;Th to the right-hand side of (5.7.15) to form the 
new field equations (5.7.17). The explanation is as follows. When referred to 
the said chart, I, vanishes and (5.7.17) becomes:” 


—That aes OH = —«(T;;-49;,;TH) (5.7.20) 


Einstein subjects (5.7.20) to the same operations applied in his (1915a) to 
(5.7.6*) to obtain (5.7.11) and (5.7.12).28 Multiplying both sides of (5.7.20) by 
gi", summing over h and subtracting t* from the resulting expressions, we 
obtain: 


—(g" Tt), = —K((Th +t) —45"(TK+ th) (5.7.21) 


This is the analogue, in the present theory, of equations (5.7.10).?° The fact that 
the energy tensor of matter and the gravitational energy matrix play identical 
roles in (5.7.21) was seen by Einstein as a considerable improvement with 
respect to the earlier system. In view of (5.7.19), equations (5.7.21) imply that: 


(grin —K (Th + th); = 0 (5.7.22) 


This corresponds to (5.7.11), the equations that placed Einstein in the 
alternative of having to forbid coordinate systems relative to which g = —1 or 
to postulate that the trace of the energy tensor is a universal constant. The 
dilemma does not arise in his latest theory, for the operations that transformed 
(5.7.6*) into (5.7.12) have a very different effect on (5.7.20). If we multiply both 
sides of the latter by g‘/ and sum over i and j we are led to: 


gi; —K(Th+ th) = 0 (5.7.23) 


which evidently entails equations (5.7.22). The latter, therefore, do not add 
anything new to (5.7.23), and “it is not necessary to make any further 
assumptions regarding the energy tensor of matter, except that it satisfies the 
energy-momentum principle [5.7.18]”.*° 
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In the long and important paper on “The Foundation of the General Theory 
of Relativity”, submitted on March 20th, 1916, Einstein retraces, with the 
benefit of hindsight, the main line of thought of his November papers. Because 
the Ricci tensor is the only tensor of order 2 “which is formed from the g; ;and 
their derivatives, contains no derivatives of order higher than the second, and is 
linear in the second derivatives”,?' he deems it justified to postulate the 
vacuum field equations (5.7.16). Once again he refers them to a special chart x 
relative to which ry —g = 1. (5.7.16) then become (5.7.20), with 0 on the right- 
hand side. These equations he derives from the variational principle (5.7.5), 
which, in the absence of matter, reads 5 { Ld*x = 0. By the same procedure that 
carried us from (5.7.20) to (5.7.21), Einstein infers that, in a vacuum, 


~(g"T*) , = ~K(t"—454tk) (5.7.24) 


He then considers the gravitational action of a continuous matter distribution, 
represented by the energy tensor T. He introduces a physical argument not 
found in the November papers, but clearly traceable to a fundamental idea of 
the 1913 “Outline” (see p. 158). 


If we consider a complete system (e.g. the solar system), the total mass of the system, and 
hence also its total gravitational action, will depend on the total energy of the system, and 
therefore on the ponderable and the gravitational energies taken together. This can be 
expressed by replacing in [5.7.24] the energy components t? of the gravitational field alone 
by the sums T* +t? of the energy components of matter and the gravitational field.*? 


The substitution proposed by Einstein yields (5.7.21), from which equations 
(5.7.20) can be recovered in their above form. From them Einstein derives 
(5.7.19) and, for the first time, the energy-momentum principle (5.7.18).3? 
The variational principle invoked by Einstein in his (1916a) is liable to the 
same strictures as its homologue in Einstein (1915a), insofar as the Lagrangian 
L given by (5.7.55) is not a scalar density and consequently is not an appro- 
priate integrand for a generally covariant law. But if we set L = ,/(—g)Ri, 
the variational equation 6jLd*x = 0 is generally covariant and entails the 
vanishing of the Einstein tensor and hence the Einstein field equations in the 
absence of matter.** The Einstein field equations in the presence of matter 
follow from 6{Ld*x = Oif L = \/(—g)Rf+«g,,t', and the variation affects 
the metric g but not the energy tensor density T.>> A scalar density Lagrangian 
and a generally covariant variational principle were first used in the theory of 
gravity by David Hilbert in a paper on “The Foundations of Physics”, 
submitted to the Gottingen Gesellschaft der Wissenschaften on November 
20th, 1915, five days before Einstein presented “The Field Equations of 
Gravitation” in Berlin.*° Hilbert combines Mie’s grand design for an 
electromagnetic theory of matter with Einstein’s outline of a geometric theory 
of gravity. Hilbert’s variational argument yields in fact the Einstein field 
equations (5.5.10) if the energy tensor in the right-hand side is identified with 
the electromagnetic energy tensor of Mie’s theory. This is indeed an 
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unwarranted “specializing assumption” about the constitution of matter, as 
Einstein suggests in his (1916f). But Hilbert’s method of proof does not 
depend on the particular form of the energy tensor or on the laws that govern 
its behaviour. 


5.8 Einstein’s Field Equations and the Geodesic Law of Motion. 


With the discovery of field equations (5.7.17) Einstein finally achieved the 
goal he had set himself in the first paper with Grossmann. He had produced a 
field theory consonant with the local validity of Special Relativity, in which the 
gravitational action of matter on matter was mediated by the spacetime 
geometry. Einstein’s theory agreed with Newton’s in all the familiar cases in 
which the latter was well confirmed, and solved the vexing anomaly of 
Mercury. The new idea of gravitation as geometry dispelled the mysteries 
attendant on Newtonian attraction, explaining at one stroke why it was 
ubiquitous and there was no shielding against it, why it acted equally on all 
kinds of bodies and behaved like the force of inertia over small spacetime 
regions. Persuaded of the theory’s strengths and seduced by its beauty, some of 
the best minds in Europe, young and old, Felix Klein and Hendrik Anton 
Lorentz, David Hilbert and Elie Cartan, Karl Schwarzschild, Willem de Sitter, 
Tullio Levi-Civita, Hermann Weyl, Arthur Eddington, Erwin Schrodinger, 
sought to master it and to further its development. We shall come across a few 
of their contributions in Chapter 6. But the relation between the Einstein field 
equations and the law of the motion of matter in a gravitational field belongs 
to the foundations of the theory and it is therefore appropriate that we deal 
with it in this chapter, even though, due to the notorious difficulty of the 
subject, I can barely hint at it.’ 

The classical field theories of gravity and electricity rest on two sets of laws: 
the field equations state how a given distribution of material sources (masses, 
charges) generates a field of force on all space, while the equations of motion 
describe how a given field affects the movements of the material objects 
(masses, charges) liable to its influence. (See equations (1.7.4) and (1.7.5), (2.2.1) 
and (2.2.2).) Because the field equations are linear, the contributions of 
different sources to the total field are mutually independent and can be added 
vectorially. It would seem, therefore, that any aggregate of sources will yield an 
admissible solution of the field equations and that the latter will not constrain 
the movement of the sources in any way. If such is the case, the equations of 
motion do not depend on the field equations, but must be postulated 
separately. When Einstein, who understood the classical theories in this way, 
put forth his plan for a new field theory of gravity, he postulated the Geodesic 
Law of motion (5.4.7*) from the outset, apparently assuming that any suitable 
system of field equations of the form (5.5.7) would be consistent with it. But the 
system (5.7.17) which finally won his assent is non-linear, so that the above 
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analysis does not apply to it. We cannot expect that every distribution of 
material sources moving in any arbitrary way is a possible realization of sucha 
system. At any rate, in the case of a closed material system, not exposed to 
freely specifiable outside influences, there is every reason to suppose that the 
worldlines of its elements will be severely constrained by their non-linear 
gravitational interaction under the Einstein field equations.’ It is then 
obviously worth asking whether those constraints are or are not compatible 
with the Geodesic Law. 

While an exact general solution to the problem of motion in General 
Relativity is not yet known and may even be impossible, many authors have 
dealt with it by approximation methods in a variety of cases. Under their 
hypotheses, which necessarily include restrictions on the form of the matter 
tensor, the gravitational field equations do actually determine, and not just 
constrain, the motion of freely falling matter. For example, in the absence of 
non-gravitational fields, matter consisting of gravity monopoles describes 
timelike geodesics of the metric determined by Einstein’s field equations. In the 
cases considered and within the limitations inherent in the chosen methods of 
proof the Geodesic Law is therefore a consequence of the Einstein equations. 
This remarkable result is linked to the fact, noted on pages 160 and 174, that 
the field equations entail the vanishing of the covariant divergence of the 
energy tensor of matter, or in other words, that equations (5.5.4) are a 
mathematical corollary of equations (5.5.10). 

In the 1913 “Outline” Einstein had already observed that the Geodesic Law 
of motion follows from equations (5.5.4) in the case of a continuous 
distribution of incoherent matter or dust.* W. Rindler offers the following 
proof of Einstein’s remark.* The energy tensor of dust satisfies equations 
(5.7.14), whence, by (5.5.4) 


(pU'U!). ; = Ul(pU4). + pU!, Ui =0 (5.8.1) 

Multiplying both sides of the second set of equations by U;(= g,,;U/) and 
summing over i, we obtain: 

(pU’), + pU,U', ,U/ = 0 (5.8.2) 


Let us calculate the value of the second term of (5.8.2) at any given worldpoint 
P. It is p(P) times the inner product of the local world velocity Up (with 
components U*|p) and the local world acceleration Ap (with components 
A'| p = Ut, ,U/| »—see p. 157), But the world velocity is everywhere orthogonal 
to the world acceleration (p. 106), whence the second term of (5.8.2) vanishes at 
P. Since P is arbitrary, 

(pU’)., =0 (5.8.3) 


Substituting from (5.8.3) into (5.8.1) we conclude that 
U'.,U/ =0 (5.8.4) 
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This means that the world acceleration field of a cloud of freely falling dust is 
everywhere zero. Consequently, by (5.5.3), each speck of dust describes a 
timelike geodesic. This exact and fairly simple derivation is especially 
significant because, as we shall see in Chapter 6, pressureless dust is generally 
regarded as a reasonable representation of matter in the very large during the 
present cosmic epoch. The Geodesic Law can be taken for granted in 
relativistic cosmology at least insofar as the dust world model holds good. 
In the fifth edition of Space-Time—Matter, published in 1923, Hermann 
Weyl attempted a derivation of the Geodesic Law of motion for an isolated 
system of material particles, without taking into consideration the form of the 
energy tensor in their interior.© A few years later, in a paper coauthored by 
J. Grommer, Einstein introduced a novel way of grappling with the problem.’ 
Convinced that in the absence of a satisfactory theory of matter every 
proposed specification of the energy tensor would be arbitrary, he treated the 
massive particles whose law of motion was the subject of inquiry as “moving 
singularities” in a gravitational field governed by the vacuum field equations 
(5.7.16). As in the classical field theories, a singularity is to be thought of as a 
worldpoint—a “moving singularity” as a worldline—where the functions 
characteristic of the field—in the present case, the metric components g,;;—are 
undefined or unsmooth. Einstein and Grommer intend to show that “the law 
of motion of singularities is fully determined by the field equations and the 
character of the singularities”. Now the notion of a singularity of the metric 
field is beset with difficulties of its own and cannot be simply taken for granted 
as a counterpart of the concept of a singularity of the classical gravitational 
and electromagnetic fields. Still guided by this analogy, Einstein himself wrote 
with Rosen in 1935 that “a singularity brings so much arbitrariness into the 
theory that it actually nullifies its laws”, and that “every field theory, in our 
opinion, must therefore adhere to the principle that singularities of the field are 
to be excluded”.? But never, to my knowledge, did he pronounce the one 
objection to singularities in General Relativity that is truly damaging to his 
treatment of particles in the Einstein-Grommer paper, namely, that a 
relativistic spacetime cannot include a point where the metric or the Riemann 
tensor are undefined without ceasing to be eo ipso a relativistic spacetime.'° In 
fact, Einstein persisted in treating particles as singularities in all his later work 
on the equations of motion, in collaboration with L. Infeld and B. Hoffmann 
(1938) and with L. Infeld alone (1940, 1949).!! In these papers, the two-body 
problem of General Relativity is solved by approximation methods under the 
restrictive hypothesis that the metric is almost flat, motion is slow, and the 
gravitational field is weak. Under such circumstances and with the margin of 
error allowed by the approximation employed, two particles of finite mass 
moving in vacuo across the gravitational field generated by them would each 
describe a timelike geodesic. The same basic approach is followed by Infeld 
and Schild (1949) in their much simpler derivation of the Geodesic Law for test 
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particles of negligible mass. The results thus obtained are somehow perplexing, 
for a geodesic maps its parameter onto a spacetime path along which the 
geometry is well-defined and cannot therefore be the locus of a “moving 
singularity”. To cope with this difficulty Infeld and Schild propose the 
following construction: 


A time-like world line Li) in a Riemannian 4-space R,g, with coordinates x* and metric 
tensor gio);(x*), which is analytic at all points of Li), represents a test particle if the 
following criterion applies: There exists a sequence of Riemannian 4-spaces R,wn) 
(N = 1,2,..., 00) with coordinates x*, with a world line L yy in each, and with a metric 
tensor g,nyij(x*) which, along L,y), has a singularity of the type representing a particle 
[described elsewhere in Infeld and Schild’s paper—R.T.], such that Limy _, 9 wyi(X") 
= 9,o)ij(") for all values of x* which in Rj) represent a point not on Li), and such that all 
points of L,y) do not tend to infinity as N-+ oo. It follows from this criterion that 
Limy .. ~ L(y) = Lo; this last relation to be interpreted by means of the point—point 
correspondence which exists between the spaces R,n), Rio) by virtue of their common 
coordinate system x*. The geodesic postulate requires that any Lio, representing a test 
particle be a geodesic in Rio). This statement, whether true or false, is meaningful, since g,o);; 
is analytic along Lio!” 


Yet surely the worldline iy), along which the g,y);; are singular, does not lie in 
the Riemannian 4-manifold R,jy), but rather in an extension of it. The authors 
apparently believe that if Riv) and Rig) — Lj) are mapped by the “common 
coordinate system x*” onto the same subset of R*, there is a unique extension 
of Riy) toa 4-manifold diffeomorphic with Ro). But this belief is illusory. Such 
an extension can be contrived in many ways, in some of which Ry) ts enriched 
by more than just a one-dimensional manifold diffeomorphic with Lo). It 
therefore makes little sense to speak about a definite sequence of worldlines 
Lay Lay... converging to Lig). 

A different approach to the problem of motion in General Relativity was 
tried by V. A. Fock.!? A particle is represented by a continuous matter 
distribution filling the interior of a very narrow tube in spacetime. The energy 
tensor vanishes, by hypothesis, outside such tubes, whereas inside them it takes 
a definite agreed form, e.g. that of the Laue tensor for a perfect fluid.'* A 
particle’s trajectory is represented by an “axis” of its respective tube, i.e. by a 
one-dimensional submanifold of the tube’s interior which meets every 
spacelike 3-dimensional section of the tube once and once only. It is shown by 
approximation methods that, on the strength of Einstein’s field equations, a 
non-spinning particle has a geodesic trajectory.'> But Fock’s approach is not 
exempt from difficulties and ambiguities and has not won general approval.'© 

P. Havas and J. N. Goldberg (1962) distinguish the laws of motion which 
relate the variables describing the trajectory of a particle to some unspecified 
force from the equations of motion which specify such a force in terms of fields 
or of the variables describing the particles that exert it. Typical of the former is 
Newton’s Second Law, F = ma; of the latter, the Newtonian Law of Gravity 
(1.7.1) or the Lorentz Force Law (2.2.2). In the spirit of this distinction Havas 
and Goldberg derive the Geodesic Law as a “law of motion” for an arbitrary 
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isolated system of n material particles gravitationally interacting in an 
unspecified metric field generated by them in agreement with the Einstein 
equations. The material system in question must consist of simple poles of the 
gravitational fields.’ Einstein’s field equations enter into the argument only 
insofar as they imply the vanishing of the covariant divergence of the energy 
tensor (equations (5.5.4) ). Thus, they entail the Geodesic Law of motion both 
in their original version (5.7.17) and in their modified form (D.6}—on page 281. 
Indeed any other geometric theory of the gravitational field that implies the 
energy conservation principle (5.5.4) would entail the same law of motion. 
Havas and Goldberg’s derivation presupposes that the unspecified metric is 
everywhere defined. As Havas has recently remarked, “no entirely rigorous 
proof has been given thus far that it is indeed possible to formulate an exact 
theory for interacting mass points for which [the metric components are | finite 
on the worldlines”.!® However, in their subsequent derivation of “equations of 
motion”, whereby they seek to specify the metric in successive approximations 
(with the Minkowski metric as the zeroth-order approximation), Havas and 
Goldberg (1962) and Havas (1979) manage to keep the metric components 
finite at any given stage of the procedure. 

In the last decade some important work has been done by W. G. Dixon and 
other researchers on the motion of extended bodies in General Relativity.*? 
They have shown that for any body represented by a timelike worldtube B, 
inside which the energy tensor of matter satisfies some definite positivity-of- 
energy requirement, there exists a unique center-of-mass worldline A contained 
in the convex hull of B. Having covariantly defined on 4, in terms of the metric 
and the energy tensor, a timelike linear momentum worldvector, an angular 
momentum bivector and a sequence of multipole moments, they prove that 
these objects together with the variables descriptive of 4 constitute a set of 
dynamic “body variables” satisfying a finite system of ordinary tensor 
differential equations. This, and the fact that, if the metric is specified, an 
energy tensor of matter satisfying the local conservation principle (5.5.4) can be 
fully recovered from the body variables, indicates according to J. Ehlers “that 
the body variables describe the body completely, and that the equations 
[between them] contain the same information as the local equation [5.5.4]”.”° 
Since Dixon’s argument depends only on the energy principle (5.5.4) and does 
not otherwise appeal to Einstein’s field equations, the metric is left unspecified. 
Hence, says Ehlers, the equations for the body variables may be described, in 
Havas and Goldberg’s terminology, as “laws of motion” for extended bodies. 
“They relate the evolution of the body variables to the gravitational potential 
[g] due (at least partly) to these bodies themselves.”?! “If g could be expressed 
as a functional of the body variables as a consequence of [the Einstein field 
equations] and boundary conditions and if these functionals were used to 
transform [the said “laws of motion”] into equations for the body variables 
alone, the latter would be ‘equations of motion’.”?? However, Ehlers suspects 
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that exact “equations of motion” in this sense do not exist in General Relativity 
and that they can be found as approximations only. 

Some advances in the theory of motion in General Relativity might be 
attained through the study of the temporal development of geometry from 
initial data, as pursued in connection with the Cauchy problem and the so- 
called dynamic formulation of the theory. If a spacetime chart x is specified 
and the respective metric components g;; and their time derivatives 0g, ;/0x° 
are given on a spacelike hypersurface S—say, on the 3-manifold x° = 0-—the 
Einstein field equations determine the values of the g; ; on the entire domain of 
dependence D(S) of the said hypersurface.” But if the g; are known at every 
point ofan open submanifold of spacetime the local values of the energy tensor 
components T;; can be figured out directly from the field equations. 
Consequently, the evolution of the geometry to and from the instantaneous 
spacetime slice S prescribes the motion of matter in the interior of D(S). It is 
doubtful, however, that this method can be used for predicting the outcome of 
actual physical experiments.?* 

In the discussion following Einstein’s Vienna lecture of 1913, Max Born 
asked him how fast gravitational effects would propagate according to the 
Einstein~Grossmann theory. Einstein replied: 


It is extraordinarily simple to write the equations for the case in which the disturbances 
introduced into the field are infinitely small. In that case, the [potentials g, ,] differ infinitely 
little from what they would be without such disturbances. The disturbances propagate then 
with the same speed as light.?* 
Einstein took this approach to “the important question as to how the 
propagation of gravitational fields take place”?® when he tackled it, a few years 
later, in the context of General Relativity. Assuming an almost flat metric, 
linked by the field equations (5.7.17) to an infinitely weak energy tensor, he was 
able to show that a small local change in the distribution of matter would bring 
about a change in the metric propagating throughout the universe with the 
speed of light.?” However, it was not until much later that the question of 
gravitational radiation in General Relativity was satisfactorily dealt with 
under less restrictive assumptions.?® Although the subject is fraught with 
conceptual, mathematical and experimental difficulties and does not seem to 
be promising from a technological point of view, it has aroused considerable 
interest, for, as Steven Weinberg points out, “the theory of gravitational 
radiation provides a crucial link between general relativity and the microscopic 
frontier of physics”.?? 

The problem of the equations of motion in General Relativity is closely 
connected with the problem of gravitational radiation.* This may be seen by 
analogy with the situation in electromagnetic theory. In that theory, conserv- 
ation of the combined energy tensor for the field and its material sources 


* The rest of this Section, until page 185, was especially written by Professor John Stachel, for 
insertion in this place. (R.T.) 
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(equations (4.5.19)) must be postulated in addition to the Maxwell equations 
for the field; while in General Relativity the generally covariant conservation 
laws for the energy tensor follow from the field equations. But in both cases, 
these conservation laws imply some connection between the behaviour of the 
field far from a system of sources and the motion of these sources. In the 
electromagnetic case, for example, one may derive the force law for a charged 
monopole particle from the conservation laws by a variety of techniques, such 
as those based on the work of Dirac (1938). With the assumption of retarded 
potential interactions between the sources, the force equation takes the form 
of the Lorentz force law plus a radiation reaction term. As a consequence of the 
radiation reaction term an isolated system of charges will lose energy as a result 
of their electromagnetic interaction. On the other hand, integration of the 
electromagnetic energy-momentum flux over the “sphere at infinity” indicates 
that electromagnetic radiation is carrying off energy to an extent that just 
balances the energy lost by the moving sources. These general results are 
supported by the existence of some exact solutions for radiating electro- 
magnetic fields including their sources. The earliest and best known is the 
Hertz oscillating dipole solution. In this case, the system is not closed, and 
what one actually verifies is that the energy fed into the system to keep the 
oscillator in a steady state just balances the energy radiated away by the field 
produced by the oscillator. What one would like in General Relativity are some 
analogous results for gravitating systems. But so far no one has been able to 
prove any comparable exact results, and even the approximate calculations 
that have been done are not yet fully satisfactory to all workers in the field.*° 

First of all, no exact solution is known representing an isolated gravitating 
system in a state of motion that would produce gravitational radiation. The 
problem in the case of gravity is much more difficult than in that of 
electromagnetism. In the latter case, non-electromagnetic forces may be used 
to produce and sustain the motion of charged particles which results in 
electromagnetic radiation. Thus, open electromagnetic systems are perfectly 
acceptable. But in the case of gravity, anything non-gravitational that is 
capable of affecting the motion of the sources must itself have an energy tensor 
and should therefore properly be included in the total energy tensor producing 
the gravitational field in question. So only closed systems could provide really 
satisfactory models of sources plus radiation fields. This means that no steady- 
state radiative solution can be satisfactory. 

An additional problem in General Relativity concerns the definition of 
energy and momentum for the gravitational field. In a Lorentz-invariant 
theory, such as electrodynamics, the role of energy and momentum as 
generators of temporal and spatial translations serves to define the com- 
ponents of an energy-momentum 4-vector, which is an integral of the 
appropriate components of the energy tensor over a spacelike hypersurface. 
Since the typical relativistic spacetime has no symmetries, (p. 154), no such 
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procedure may be used. For asymptotically flat spacetimes corresponding to 
the gravitational fields produced by isolated sources, asymptotic symmetries 
exist allowing the definition of global quantities which can be identified as the 
total energy and momentum of the system. In the past, such quantities were 
usually defined in terms of integrals over spacelike hypersurfaces of various 
non-tensorial quantities—the so-called gravitational energy pseudotensors 
(like the energy matrix defined by (5.7.8)). For asymptotically stationary 
spacetimes the resulting global quantities could be shown to be invariant 
under a sufficiently restricted class of coordinate transformations. But the 
analysis of such solutions contributed nothing to the radiation problem, since 
the energy so defined was constant in time. The use of such gravitational 
energy pseudotensors in the treatment of radiation problems remained 
therefore somewhat dubious. 

However, great progress has been made in the last 15 years in the analysis of 
the exact asymptotic structure of the gravitational fields of isolated sources. 
(Asymptotic structure here means the structure of the field far from its 
sources.) This concept can be made precise, and indeed by the addition of a 
boundary to the spacetime manifold it has been possible to define past (and 
future) null infinities for such systems. That is, the ideal points, defined by past 
(future) null geodesics of manifolds corresponding to suitably well-behaved 
solutions of the vacuum Einstein equations (5.7.16), themselves form 3- 
manifolds which may be attached as boundaries to the 4-manifolds on which 
the field equations are respectively satisfied.?! In a suitably defined conformal 
sense, the said 3-manifolds are null hypersurfaces of the manifold-with- 
boundary.*? This technique enables one to define a pair of so-called “news 
functions” on future null infinity, whose non-vanishing is a criterion for the 
existence of gravitational radiation. A mass-loss formula then relates these 
news functions to a “mass function”, whose integral over the sphere at infinity 
is monotonically decreasing whenever the news functions are non-vanishing. 
(The two news functions correspond to the two degrees of freedom of the 
gravitational field, comparable to the two independent states of polarization of 
the electromagnetic field.) The great remaining problem is to connect these 
exact but asymptotic results with some aspects of the “near field” behaviour of 
the presumed sources of those “far fields”. That there is some connection seems 
clear. For static spacetimes, for example, the integral of the mass function 
corresponds to the well-defined “Schwarzschild mass” of the solution.*? Some 
linearized approximation results also exist which seem to indicate that the 
news functions are connected with the changing multipole structure of the field 
sources in a way reminiscent of the corresponding electromagnetic results.>* 

The linearized expansion technique is based upon the assumption that the 
gravitational field may be taken to differ only slightly from flat, Minkowski 
spacetime.** The first-order linearized field already gives an excellent approxi- 
mation to the behaviour of the gravitational radiation field far from its sources. 
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The major problem with any attempt to approach the problem of the relation 
between gravitational radiation and motion in terms of the linearized 
approximation technique is that the linearized gravitational field equations 
themselves imply an integrability condition on their sources, namely, the 
special-relativistic conservation law (4.5.20) for the energy tensor. In the 
absence of any non-gravitational interactions, this means that monopole 
sources, for example, must move along straight lines in Minkowski spacetime 
without interacting. If there is any linearized gravitational radiation present, it 
cannot have been produced by free gravitational motion of linearized sources; 
nor can the radiation even interact with linearized matter in a linearized theory. 
Thus, linearized gravitational theory cannot be consistently applied to the 
problem at hand; at best it might form the starting point for a complicated 
higher-order approximation procedure. Attempts to carry out such approxi- 
mations based on the linearized theory have so far not yielded any conclusive 
results. 

The other major approximation technique which has been applied to this 
problem is the so-called new approximation or EIH method (after Einstein, 
Infeld and Hoffmann (1938), although it actually goes back to Lorentz and 
Droste). As noted on page 178, the EIH technique is based upon the 
assumption that all motions of sources are slow and all gravitational fields are 
weak. Starting from the Newtonian gravitational field as the lowest order 
approximation, it gives an excellent description of post-Newtonian and even 
post-post-Newtonian corrections to the motion of such slow, weak sources 
of the gravitational field. No effects of gravitational radiation occur up to 
this order of approximation. Attempts to apply the technique directly to 
the radiation problem result in horrendous calculations, since radiation effects 
do not arise until such a very high order in the approximation scheme. Indeed, 
the very assumption of slow motions and weak fields makes the method ill- 
suited to the radiation problem. Attempts to use this method have accordingly 
also failed to produce any conclusive results. 

However, a method has been developed which enables one to combine the 
advantages of the EIH technique for treating the motion of the sources with 
those of the linearized technique for treating the distant radiation field.2° The 
EIH technique is used for the near zone, where the sources are located, 
resulting in (elliptic) Poisson-type equations relating the metric components to 
the components of the energy tensor of the sources. The linearized technique is 
applied in the far zone, many wavelengths away from the sources, resulting in 
(hyperbolic) wave equations for the propagation of the metric field com- 
ponents describing the radiation field. Relating the solutions in the two zones 
requires the use of singular perturbation techniques. The method of matched 
asymptotic expansions is used to match the two solutions in an intermediate 
zone. This enables the radiation loss in the far zone, for an assumed outgoing 
wave solution, to be matched with secular “radiation reaction” effects on the 
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motion of the sources in the near zone. The method is known to give 
reasonable results in the electromagnetic case, where some problems can also 
be independently solved by other methods, thus allowing for comparison. In 
the gravitational case, no such independent procedures for evaluating the 
results are available; nor, indeed, is there any proof that the “approximate 
solutions” obtained really approximate anything like an exact solution. 
However, with these cautions, the method of matched asymptotic expansions 
may be said to furnish the most reliable approximation results so far obtained 
for this important problem. 


CHAPTER 6 


Gravitational Geometry 


6.1 Structures of Spacetime 


The development of General Relativity has been one of the great intellectual 
adventures of our time. The story of the theory’s reception and growth, its 
lasting successes and apparent failures would fit well in this place, but would 
also lengthen the book considerably and go beyond its intended scope. 
Moreover, the story cannot be told properly until much more detailed 
scholarly research has been carried out.! Therefore I shall not try to follow 
here the same method of historical reconstruction as in the preceding chapter. 
Instead I shall elucidate a few matters that I consider especially significant 
from a philosophical point of view. I begin with some mathematical ideas 
which greatly clarify the meaning of the hypothesis that physical becoming is 
staged in a relativistic spacetime. 

Let us recall our twofold approach to Minkowski spacetime in Section 4.2. 
We regarded it as the 4-manifold &—diffeomorphic with R*—endowed with 
the flat Minkowski metric 4, and also as the affine metric space arising from the 
transitive and effective action of the additive group of R*—endowed with the 
Minkowski inner product <a, b> = (agbo —a,b,)—on the unstructured set of 
worldpoints. We saw that both views were equivalent inasmuch as the 
differentiable structure of .@ could be obtained from the action, the straights 
of the affine space were the ranges of Riemannian geodesics of (.#, 9) and the 
interval assigned to each pair of worldpoints by the affine metric was measured 
by the energy of the Riemannian geodesic joining them (p. 95). That the affine 
metric induced by the Minkowski inner product should agree with 4 is not 
surprising, but the perfect agreement of the affine straights with the geodesics 
of (4, 9) is quite remarkable. For the concept of a Riemannian geodesic was 
defined in terms of the metric (p. 94), while the straights of an affine space are 
generated by the action of the associated vector space independently of any 
inner product that could be defined on the latter (p. 91). This suggests that we 
should be able to produce a non-metrical characterization of the geodesics of 
(4,1). In order to do so, let us look on Minkowski’s world .@ as the 4- 
manifold generated by the action of R* on the set of worldpoints (p. 92). 
Relative to this manifold structure the action of R* is smooth, and the additive 
group of R* acts on @ as a Lie group of transformations (p. 26). A manifold 
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constructed in this way from an affine space may be called an affine manifold. 
The action of R* on .@ induces a partition of the tangent bundle of .@ into 
equivalence classes, as follows. Each te R* determines a translation of . by 
PtP. We say that two vectors tangent to.# are parallel if one is the image of 
the other by the differential of a translation.” The reader will easily verify that 
vector parallelism, thus defined, is indeed an equivalence. Let us now bestow 
the Minkowski metric 4 on the affine manifold .# in such a way that each 
translation turns out to be an isometry (.e., so that t*4 = 4, for each translation 
t). The metric is then said to be consistent with the action of R*. A curve y in.# 
is a Riemannian geodesic of (.4, 9) if and only if all its tangents (p. 259) are 
mutually parallel. We can thus give the following non-metric characterization 
of geodesics: a curve y in the affine manifold .@ is an (affine) geodesic if and 
only if y is an integral curve (p. 260) of a field of parallel vectors. The affine 
geodesics of the affine manifold become the Riemannian geodesics of the 
Riemannian manifold if the metric is so defined that translations become 
isometries. 

Let us turn now to the more or less arbitrary relativistic spacetime (W, g) 
countenanced by General Relativity. R* can act transitively on W as a Lie 
group of translations if and only if W admits a global chart, which of course 
would entail that W is a copy of Minkowski’s world.? But even in the 
exceptional case in which this is so, no such action of R* on can be consistent 
with the metric g, in the sense explained above, unless the latter is flat. 
Consequently our concept of vector parallelism is generally inapplicable to 
(W, g). Nevertheless, by a masterly extension of this concept, Tullio Levi-Civita 
(1917) provided a non-metric characterization of the geodesics of any 
Riemannian manifold.* 

Levi-Civita’s idea is beautifully simple, as the following consideration— 
made in his treatise on The Absolute Differential Calculus—will show. Let S be 
a smooth surface embedded in Euclidean 3-space. (Any 2-manifold can be 
constructed by pasting together a collection of such surfaces.) If S is 
developable onto a plane, like a cylinder or a cone, there is an isometric map /f of 
S into R?. Since R? is an affine manifold in an obvious way, our earlier concept 
of vector parallel is applicable to S. (Two vectors v and w are parallel in S iff, v 
and f, w are parallel in R?.) Assume therefore that S is non-developable and let 
tt+yt be a smooth curve in S, joining two not necessarily distinct points P and 
Q. As t ranges over the interval on which our curve is defined, the plane tangent 
to S at yt generally changes. We thus obtain a family of planes { 2,}, indexed by 
the parametre t. The family is enveloped by a smooth developable surface S’ 
which touches S along the range of y.® Each vector tangent to S at a point of y 
can be naturally regarded as a vector tangent to S’. We say that two vectors v 
and w, tangent to S at points of y, are parallel along y in S if v and ware parallel 
in S’. Vector parallelism in non-developable surfaces is essentially relative to 
the curve—or rather to the path of the curve—along which it is said to hold. If 
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Pand Qareany points of S joined by two curves 7, and y,,a vector at Q parallel 
to a given vector at P along 7, is generally not parallel to it along y,. This 
statement is true even if P = Q and y, and y, are two loops. 

Levi-Civita defines vector parallelism along a curve in an arbitrary 
Riemannian n-manifold by a straightforward generalization to n dimensions 
of the foregoing notion of parallelism in a surface. He uses the fact that any 
such n-manifold is locally isometrically embeddable into a flat m-manifold for 
sufficiently large m.’ But, as he points out, the concept defined is intrinsic, that 
is, independent of any such embedding. Levi-Civita establishes the following 
analytic criterion of parallelism. Let us say that a vector field V on the 
Riemannian n-manifold ( M, g) is parallel along a curve y in M if all values of V 
at points of y are mutually parallel along y. Let x be a chart defined on a 
neighbourhood of the points of y and let n° scalar fields I'j, be defined on the 
domain of x, in terms of the components of g relative to x, by equations (D.1) 
on page 281. The vector field V = V‘d/0x' is parallel along y if and only if 


dvi-y dx/-y 
dt dt 


Since the I’, depend only on the metric g and not ona particular embedding of 
M into a manifold of higher dimension number, equations (6.1.1) constitute an 
intrinsic definition of parallelism along y. Moreover, they will provide a non- 
metric definition of parallelism along y if, following Hermann Weyl (1918a) we 
make the 'j, independent of the g;;. This can be done as follows. Because the 
metric g is a (0, 2)-tensor field, two arrays Ij, and I’), defined by (D.1) relative 
to two different charts x and x’ will satisfy, on the intersection of the domains 
of x and x’, equations (C.4.3) of p. 273. Suppose now that on the domain of 
each chart of an n-manifold M we define an array of n° scalar fields Ti, subject 
to the “transformation law” (C.4.3). If the mutual dependences thus defined are 
smooth we have bestowed on M a “linear connection” ’, whose “components” 
relative to each chart of M are the respective I'},. If the fields of each array 
satisfy the condition I, = I,;, we say that Fis symmetric. T is said to endow 
M with an “affine structure” (though it does not make M into an affine space in 
the standard sense of this expression). 

A Riemannian metric g on the n-manifold M evidently bestows on it the 
unique linear connection defined by equations (D.1). This is known as the Levi- 
Civita connection, and is the only symmetric connection that is consistent with g 
in the following sense: If two vectors v and w ata point Pe M are congruent or 
orthogonal (ie. if gp(v, v) = gp(w, w), or if gp(v, w) = 0), the vectors parallel 
to vand watany Qe M alongany curve joining P and Q are also congruent—in 
the former case—or orthogonal—in the latter. Since a given symmetric 
connection can stand in this special relation to several distinct metrics, the 
affine structure of a relativistic spacetime, defined by its Levi-Civita connec- 
tion, is a more general and so to speak a “deeper” structure than its metric. This 
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is important, for all the main notions of Einstein’s theory of gravity are linked 
directly to the connection. Thus, the “gravitational field strengths” (pp. 151 and 
317, n. 20) are just the connection components. The operation of covariant 
differentiation through which, according to Einstein, the influence of the 
gravitational field on physical processes is made explicit depends on the 
connection alone and can be defined in terms of it by equations (5.5.1). Also 
the spacetime curvature, as embodied in the (1, 3) Riemann tensor, and the 
Ricci tensor, that occurs in Einstein’s field equations, can be conceived purely 
in terms of the connection. (See equations (C.6.9) on p. 277. The Ricci tensor is 
obtained by substituting / for ion these equations and summing, as usual, over 
repeated indices. Observe that the Ricci tensor thus derived from an arbitrary 
symmetric linear connection need not be symmetric, though indeed it will be so 
if the connection is related to a Riemannian metric by (D.1) i.e. if it is a Levi- 
Civita connection.) Let us say hereafter that a curve y ina manifold M endowed 
with a linear connection I is a geodesic of T if its tangent vectors are mutually 
parallel along it—or, as one often says, if y is “self-parallel”. y is then an integral 
curve of a vector field whose values on y’s range are parallel along it. It turns 
out that the Riemannian geodesics of a Riemannian manifold are precisely the 
geodesics of its Levi-Civita connection.® 

Weyl’s definition of a linear connection as so many arrays of scalar fields 
transforming into one another according to (C.4.3) is unimpeachable if, as the 
saying goes, a mathematical entity is just what the mathematician chooses it to 
be—neither more nor less. But non-mathematicians like myself, who do not 
revel in the metamorphoses of numbers and have been stunned by the 
suggestion that gravity is just the visible expression of the linear connection of 
the world, are comforted by the knowledge that other mathematicians, 
working out some insights of Elie Cartan, have produced a more perspicuous 
definition. I explain it in Appendix C. Though it is not necessary to study it in 
order to understand what follows, all that is hereafter said about linear 
connections applies to the concept developed there—which, by the way, is 
extensionally equivalent to Weyl’s (p. 273). One interesting consequence of the 
said definition, which may be worth noting here, is that parallelism along 
curves in an n-manifold M is directly linked to and as it were the reflection 
of the existence of an absolute parallelism in the bundle of n-ads over M 
(p. 273). 

The idea of path-dependent vector parallelism inspired Hermann Weyl’s 
attempt at a unified, purely geometric theory of gravitation and electricity. In 
this theory, “all effective reality in the world is a manifestation of the world 
metric”. The latter, however, is not Riemannian, but is conceived after Weyl’s 
idea of a “pure infinitesimal geometry”, a genuine “geometry of proximity” 
(Nahe-Geometrie), in which separate geometrical objects can only be compared 
through intermediaries lying in between. In a multidimensional manifold such 
mediated comparisons depend on the path along which the intermediaries are 
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placed. Hence, in Weyl’s geometry—as opposed to Euclid’s—there are no 
absolute relations between distant objects, and vector congruence, no less than 
vector parallelism, is path-dependent. Applied to the real world, this means 
that two units of length or time (e.g. two wavelengths, or two wave periods, 
instanced by appropriate standard devices) will not generally agree every time 
they meet unless they have gone along nearby worldlines from one encounter 
to the next.!° Specifically, Weyl’s “world metric” depends on a symmetric, non- 
degenerate (0,2) tensor field g—in effect, a Riemannian metric—and a 
covector field ~, which jointly govern vector congruence as follows. Let x bea 
chart defined at a point P and let P’ be a worldpoint infinitely close to P such 
that x'( P’) = x'( P)+6x'. Let p; be the components, relative to x, of g at P, 
and let g,, and g,,+ 6g,; be those of g at P and P, respectively. Consider a 
vector at P and another vector at P’, whose components relative to x are 
respectively V' and W'‘. The two vectors are congruent if and only if 


gijV'V) = (1+, 5x") (g;; + 6g;;)W'W! (6.1.2) 


This criterion provides the means of judging the congruence of vectors at 
separate points, relative to a specific path joining them. Evidently, congruence 
is not affected if @ —d0 and e’g are substituted for @ and g, respectively, with 6 
an arbitrary scalar field. As in Einstein’s theory, gravitational phenomena are 
governed by a linear connection coupled to the world metric in the following 
way: if v and w are congruent vectors at a given point, the vectors parallel to v 
and w at any other point along any given curve are also congruent along that 
particular curve. This implies that the connection components Ij, relative to 
the chart x introduced above must satisfy the following conditions: 
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(Note that (6.1.3) becomes (D.1) if @ 1s identically 0.) The linear connection is 
thus fully determined by g and @. Weyl assumes that the covector field @ is 
none other than the electromagnetic potential.'! 

The possibility of disregarding metrics inherent in the concept of linear 
connection was first fully utilized in Eddington’s unified theory of gravity and 
electricity, with which Einstein sympathized for a short time.'? Spacetime 
geometry rests here on a symmetric linear connection, whose 40 independent 
components are to be the unknowns of the field equations. As I noted on 
p. 189, the Ricci tensor defined by this connection need not be symmetric, but 
may be regarded, like every (0, 2)-tensor field, as the sum of a symmetric and an 
antisymmetric tensor field.!* The theory equates the former with the spacetime 
metric, the latter with the electromagnetic field tensor.'* 

The linear connection of spacetime is also the key notion of the unified 
theory of gravity and electricity introduced by Einstein himself in the late 
twenties.'° This theory postulates that spacetime has a field of orthonormal 
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tetrads (pp. 99, and 303, note 1). Such a field determines a canonic iso- 
morphism between any two tangent spaces, whence vector parallelism holds 
absolutely, irrespective of any “path of comparison” (we have Fern- 
parallelismus—“parallelism at a distance”—as Einstein calls it), even though 
the Riemannian metric implicit in the assumption that the tetrad field is 
orthonormal is not held to be Minkowskian, but only Lorentzian. As Cartan 
pointed out to Einstein, what all this amounts to is simply that spacetime has a 
linear connection which is flat (zero curvature) but generally non-symmetric 
(non-zero torsion).'© A non-symmetric linear connection is also a character- 
istic of Einstein’s final theory, the Relativistic Theory of the Non-Symmetric 
Field.’’ 

The demetricized concept of linear connection enabled Cartan to produce 
the geometrical formulation of Newton’s theory of gravity mentioned on page 
34. The empty Newtonian spacetime described in Section 1.6 can be 
characterized by means of a flat symmetric linear connection I’, a scalar field t 
and a symmetric (2, 0)-tensor field g on Minkowski’s world .4. t—which we 
may regard as determined only up to a linear transformation tat + B 
(a, BER, « # 0)}—assigns a time to each worldpoint. The fibre of t over each of 
its values is a 3-manifold diffeomorphic to R*. g satisfies the condition 
g (dt, dt) = 0, and induces a Euclidean metric on each 3-manifold t = const. A 
curve y in .@ is “timelike” (in the sense of p. 27) if y 0 on the entire range 
of y (cf. p. 259). A free test particle describes a timelike geodesic of the 
connection I. Let x bea global chart of .@. We assume that (dt/0x'), ; and g', 
vanish identically (where the g‘/ are the components of g relative to x and the 
semicolon indicates covariant differentiation in (#,I°)). This completes the 
characterization of empty Newtonian spacetime. If matter is present in 
significant amounts our test particles come under the influence of gravity. This 
is accounted for by postulating that spacetime is “warped” by matter, i.e. that 
the connection TF is no longer flat. Freely falling test particles still obey the 
geodesic law. A few more assumptions must be added concerning I and its 
linkage to g and 1.'® But it is time to stop this digression and return to the 
consideration of General Relativity. 

Just as many Riemannian metrics on a given manifold can share the same 
Levi-Civita connection, so many linear connections will define the same set of 
geodesics (p. 278). The selection of one such set—or of the class of linear 
connections that agree on it—endows a manifold with a particular structure, 
which Weyl called projective. (Although it does not make the manifold into a 
projective space!) It is a trivial, though not insignificant remark, that the 
possible worldlines of the simplest (non-spinning, monopole) freely falling 
particles in General Relativity are determined by the projective structure of 
spacetime, regardless of its affine and metric structures—though, of course, 
when the latter structures are given, the projective structure is defined 
uniquely. 
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Another structure underlying every Riemannian manifold is the so-called 
conformal structure. Two Riemannian metrics g and g’ on a manifold M are 
said to be conformal if g' = 47g, where 4 is an arbitrary scalar field on M. 
Conformal metrics obviously have the same signature. If they are proper (i.e. 
positive or negative definite), they all assign the same value to the angle 
between two given vectors at a point.'? Thus, angle measure pertains to the 
conformal structure of a proper Riemannian manifold, rather than to the full 
metric structure. If, on the other hand, the conformal metrics under 
consideration are indefinite, they all determine the same partition of the 
tangent bundle into spacelike, timelike and null vectors.?° In either case, 
conformal metrics agree in their definition of orthogonality.?! Let us say that 
a worldpoint P in the relativistic spacetime (Wg) causally precedes the 
worldpoint Q (P < Q) if information can flow from P to Q (cf. p. 121). Due to 
the local validity of Special Relativity, it is clear that P < Q only if P = Q or 
there is a nowhere spacelike curve joining P to Q. Let us stipulate further that P 
chronologically precedes Q(P <Q) if P <Q and P and Q are joined by a 
timelike curve. It is not hard to verify that (W, <, <) satisfies conditions 
(i}-(vi) in the definition of causal spaces on page 123. If, moreover, (W, g) does 
not contain any closed nowhere spacelike curves, (W, <, <) also satisfies 
condition (vii), so that the spacetime (W, g) has a causal structure determined 
by the metric g. Evidently, two metrics g and g’ bestow the same causal 
structure on the spacetime manifold if and only if they are conformal. (It is 
worth noting that in any relativistic spacetime each worldpoint P has a 
neighbourhood U such that (U, <, <}--with < and < defined as above but 
restricted to U—is a causal space.) 

Let P be a projective and C a conformal structure bestowed on an 
n-manifold M. We may equate P and C, respectively, with the class of linear 
connections and the class of conformal Riemannian metrics that define them. 
Let us assume that the metrics of C have signature — 2. (We may then say that 
C is a “Lorentzian conformal structure”.) A metric g of C is compatible with a 
linear connection I’ of P if vector parallelism as defined by IT preserves 
congruence and orthogonality as defined by g, in the manner explained on 
page 188 (whence, if ’ is symmetric, it must be the Levi-Civita connection of 
(M, g)). Let us say that C itself is compatible with ['eP if I-parallelism 
preserves C-orthogonality. We can also think ofa more basic, direct agreement 
between P and C. A hypersurface H in (M,C) is said to be null if it is 
everywhere tangent to the local cone of null vectors. (At each Pe H, the tangent 
space H, meets but does not cut the past null cone.) We shall say that P is 
compatible with C if every P-geodesic which is C-null lies on a C-null 
hypersurface. If P is compatible with C, C is compatible with one and only one 
symmetric linear connection of P. If this connection has a symmetric Ricci 
tensor there is a Riemannian metric of C, unique up to a constant scale factor, 
that is compatible with it. In this way, the conformal structure C and the 
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projective structure P of a relativistic spacetime jointly determine its metric. 

This theorem was proved 60 years ago by Hermann Weyl, who at once drew 
the inference that “the world metric can be established (festgelegt) solely by the 
observation of the ‘natural’ motion of material particles and the propagation 
of physical effects, especially of light”, without resorting to rods and clocks.?? 
Curiously enough, this remark and the mathematical result that prompted it 
were persistently overlooked by philosophers—although it is clearly relevant, 
say, to Bridgman’s complaint that General Relativity is devoid of operational 
meaning, or to Reichenbach’s claim that spacetime geometry is conventional— 
until it gave rise, in the early seventies, to the fascinating axiomatic formulation 
of General Relativity put forward by J. Ehlers, F. A. E. Pirani and A. Schild as 
“The Geometry of Free Fall and Light Propagation”. Just as Helmholtz tried 
to base ordinary space geometry on the (suitably idealized) facts of (practically) 
rigid motion, Ehlers, Pirani and Schild “attempt to derive the conformal, 
projective, affine, and metric structures of space-time from some qualitative 
(incidence and differential-topological) properties of the phenomena of light 
propagation and free fall that are strongly suggested by experience.”* Indeed, 
our authors go even further, for instead of taking the manifold structure of the 
world for granted, as Weyl did, they construct it from a few assumptions 
regarding particles and their interactions. While it would take us too long to 
explain their entire axiom system, we can afford to take a look at this 
interesting construction.?* 

Let W stand for the unstructured set of worldpoints. By a “particle” I mean a 
subset of W which could be the spacetime trajectory (the life history) of a freely 
falling material particle. By a “signal” I mean a subset of W which could be the 
spacetime trajectory of a light-signal in vacuo. Let us postulate: 


I. A particle is an oriented 1-manifold, diffeomorphic with R. 


The orientation of a particle is meant to reflect the time order of events. If aand 
bare particles, an echo on a from bis a mapping f of a into a, such that for each 
Pea, P precedes f ( P) and there are two signals, op and op, which meet a at P 
and at f(P), respectively, and intersect at a point of b. An echo f on a from b 
obviously determines a mapping of a into b by P+> ap Nayp. Such a mapping 
is called a message from a to b. We postulate further: 


II. Any message from a particle to another is smooth. Any echo on a 
particle from another is a diffeomorphism. 


Let U < V cW. Let aand b be two particles. Suppose that each Pe U is the 
meeting point of exactly two signals that meet a inside V and exactly two 
signals that meet b inside V. Denote the intersections of a and b with the said 
signals through P by P} and P2, P; and P}, respectively, reserving the index 1 
for the earlier intersection. Choose a chart « of a and a chart f of b. U is 
mapped into R* by P++(a(P}), a(P2), B(P}), B(P?)). If this mapping is 
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injective and its range is open in R*, we call it a radar chart, defined on U, 
relative to V, and based on a and Bb. We finally postulate: 


Ill. Each PeW lies on the domain of a radar chart, relative to an 
appropriate subset of Wand based on a suitable pair of particles. 
The collection of all such radar charts is a smooth atlas.?° 


The atlas of radar charts defines a topology and a differentiable structure in W, 
in the manner explained on page 348, note 3 to Appendix A.?° 


6.2 Mach’s Principle and the Advent of Relativistic Cosmology 


The linear connection of Minkowski spacetime accounts for the essential 
difference between inertial and accelerated motion in Special Relativity. An 
object that moves inertially sticks to its spacetime direction and describes a 
geodesic; an accelerated object departs incessantly from the geodesic trajectory 
of its mirf (p. 96). While the former’s worldvelocity remains parallel to itself 
along the object’s worldline, the latter’s is being continually deviated by 
extraneous forces. The same account can be extended to Newtonian dynamics 
by means of the four-dimensional formulation sketched in Section 1.6. In 
either theory, Foucault’s pendulum holds aloof from the whirling earth and 
faithfully sweeps a plane fixed in an inertial system because it follows the 
dictates of the spacetime geometry. Thus, Minkowski’s vision, perfected by the 
work of Levi-Civita and Weyl, at last provides a truly geometric understanding 
of the strains and stresses that show up in accelerated systems, which Newton 
had attributed, obscurely yet insightfully, to the peculiar relation of those 
systems to absolute space.' But, although chronogeometry sheds light on 
Newton’s explanation of such phenomena, it does not make it a whit less 
uncanny. Newtonian dynamics and Special Relativity put us in the grip of a 
universal structure, capable of shattering to pieces a body that resists its sway, 
and yet impervious to the body’s reaction. In this respect, both theories 
contrast sharply with General Relativity. In the latter, again, geodesic motion, 
as in free fall, differs essentially from non-geodesic motion, as illustrated, say, 
by “rest” in a gravitational field. There can be no question of dumping all 
motions together into a single equivalence class, as the expression “general 
relativity” might suggest. But the linear connection responsible for their 
differences is not itself indifferent to the whereabouts of the bodies subject to 
its guidance, but is governed, through the Einstein field equations, by the 
actual distribution of matter. As Hermann Weyl eloquently said: 


If the guidance field [i.e. the linear connection—R.T.] manifests itself in mechanical 
processes as an effective power of eventually shattering might, at war with the forces of 
nature, we must regard this field as something real. It cannot be—as Newtonian mechanics 
and the Special Theory of Relativity assumed—a formal, unconditionally given structure of 
the world, independent of matter and its states, but must in turn suffer under the action of 
matter, be determined by matter, and change as it changes.” 
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The assimilation of gravity and inertia by the Equivalence Principle thus leads 
not only to a geometrical conception of gravity in the likeness of inertia, but 
also to a dynamical conception of inertia in the likeness of gravity. Einstein 
repeatedly described this feature of General Relativity as its chief advantage 
over its predecessors; most emphatically perhaps in a letter to Max Born of 
May 12th, 1952, in which he comments on an apparent failure of one of the 
three classical tests of the theory: 


Even if no deviation of light, no perihelion advance and no shift of spectral lines were known 
at all, the gravitational field equations would be convincing, because they avoid the inertial 
system—that ghost which acts on everything but on which things do not react.? 


The field equations equate the Einstein tensor, characteristic of the spacetime 
geometry, with the energy tensor that specifies the matter distribution. Since 
equality is a symmetrical relation, neither term of which can as such claim 
priority over the other, the field equations bespeak the mutual determination 
of the metric and the matter fields. Their interaction is what puts flesh and 
blood into the otherwise ghostlike geometry. However, Einstein’s earlier 
pronouncements on this question demand that the metric field be completely 
and as it were unilaterally determined by matter. This demand, of course, goes 
beyond what the gravitational field equations can express. Its motivation is 
philosophical, as Einstein clearly explained to Gustav Mie, in two letters 
written in February 1918. In the first of them, dated February 8th, he gives the 
following statement of “the standpoint that I call relativistic”: 


The behaviour (continuation of the temporal evolution of states) of every natural body is 
such, that it is univocally determined by its own state and that of all the other bodies. (The 
statement must naturally be interpreted without instantaneous action at a distance.) By 
“natural bodies” we must understand here only the perceivable bodies of our sensible world 
(die wahrnehmbaren Kérper unserer Sinnenwelt).* 


In the second letter, dated February 22nd, he asserts that the “relativistic 
standpoint” prescribes the following conception of inertia: 


If Lis the actual trajectory of a certain freely moving body and L’ is a trajectory that deviates 
from it but has the same initial conditions, the relativistic conception demands that the 
trajectory L which is in fact followed be preferred to the logically equally possible L’ due toa 
real cause. [ ... ] Such a real cause can only consist in the relative positions and states of 
motion of all the other bodies that exist in the world. The inertial behaviour of our mass 
must be completely and univocally determined by them. Mathematically this means that the 
gi; must be completely determined by the T; ,;—naturally, only up to the four arbitrary 
functions that correspond to the possibility of freely choosing the coordinates.* 


In a paper on “Questions of Principle regarding the General Theory of 
Relativity” submitted on March 6th, 1918, the aforesaid demand is distin- 
guished from the “Relativity Principle’—which is here conceived, as on page 
153 above, as a requirement of general covariance—and is listed after the latter 
and the Equivalence Principle as one of the “three main points of view” 
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(Hauptgesichtspunkte) on which General Relativity is based. It is stated as 
follows: 


Mach’s Principle (Machsches Prinzip): The [metric] field is exhaustively 
determined by the masses of bodies. Since mass and energy are the same, 
according to the results of the Special Theory of Relativity, and since the 
energy is formally described by the symmetric energy tensor (T;,), this 
means that the [metric] field is conditioned and determined by the energy 
tensor of matter.® 


The designation for this principle was chosen, says Einstein, because it is “a 
generalization of Mach’s demand that inertia be explained by the interaction 
of bodies”.’ For a like reason he might as well have dubbed it Newton’s 
Principle, as it also generalizes Newton’s similar demand regarding the 
explanation of gravity. But at that time Einstein was not too keen on the 
Newtonian heritage. Indeed, Mach’s Principle was directly associated in his 
mind with Mach’s own criticism of Newton. In the obituary he wrote for the 
Physikalische Zeitschrift when Mach died in 1916, Einstein quotes extensively 
from the relevant section of Die Mechanik (1883), which shows, in his opinion, 
that Mach “was not far from postulating a general theory of relativity half a 
century ago”:? The leading idea of Mach’s criticism was that the state of 
motion of a body can only be judged with respect to other bodies. As Einstein 
had pointed out already in 1913, 


in view of this, it seems meaningless to attribute to a body an absolute resistance to 
acceleration (inertial resistance of bodies in the sense of classical mechanics); the emergence 
of an inertial resistance should be linked rather to the relative acceleration of the body in 
question with respect to other bodies. It must be postulated that the inertial resistance of a 
body can be increased by placing unaccelerated inertial masses around it; and this increase of 
the inertial resistance must again fall away if those masses share the acceleration of the 
body.'° 


That the inertia of a body would increase as other massive bodies were piled up 
in its neighbourhood was a fairly obvious consequence of Einstein’s tentative 
theory of the static gravitational field.’ According to Einstein, the same effect 
held in the Einstein—Grossmann theory. In this theory, too, as required by the 
above analysis, the increment of inertia will not take place if the piled up 
masses participate in the acceleration of the body in question. This effect can 
also be described as follows: The accelerations of the neighbouring masses 
induce on the body an accelerating force in their same direction.'? Moreover, 
as Einstein announced to Mach himself on June 25th, 1913, the 
Einstein-Grossmann theory entails that the rotation (“relative to the fixed 
stars”) of a massive spherical shell about an axis through its centre generates a 
Coriolis field in its interior, which drags along the plane of Foucault’s 
pendulum. !? In the Princeton lectures of 1921 Einstein derived all three effects 
from the Geodesic Law and the linearized field equations of General 
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Relativity.'* Though too small to be accessible to experiment, the three 
Machian effects are formally entailed, according to Einstein, by the general 
theory and are therefore corroborated by its empirical success. They, in turn, 
lend strong support to Mach’s ideas about the relativity of inertia. Einstein 
noted that “if we think these ideas consistently through to the end we must 
expect the whole inertia, that is, the whole g; ,-field, to be determined by the 
matter of the universe”.!> In this way, Mach’s thought does indeed stand 
behind Mach’s Principle and is the key to Einstein’s understanding of it. 

We saw on page 112 how Einstein’s proof that a body at rest loses proper 
mass as it radiates energy had led him to assert that the mass of a body is a 
measure of its energy content. Likewise, from the alleged fact that the inertial 
resistance of a particle is strengthened by piling up masses in its neighbour- 
hood he extrapolated, no less daringly, that all inertial effects must vanish if the 
surrounding masses are pulled back to infinity. As Einstein said in the 
“Cosmological Considerations” of 1917: 


In a consistent theory of relativity there can be no inertia relative to “space” but only an 
inertia of masses relative to each other. Hence if I take a mass sufficiently far away from all 
other masses in the world its inertia must fall down to zero.'® 


It is hard to imagine how the inertia of a particle will fade away as it is re- 
moved from the neighbourhood of all other bodies. Does its worldvelocity 
become more and more erratic, its worldline less and less determinate? What 
was usually assumed—for instance, in Einstein’s own approximate and in 
K. Schwarzchild’s rigorous treatment of the gravitational field of a single 
star’ ’—-was that asa test particle is withdrawn from the influence of a central 
body its guidance gradually devolves on the flat Minkowski metric. The 
assumption implies, however, that inertia is merely influenced, not determined, 
by matter.'® Einstein therefore tried to work out the implications of a different 
assumption, namely, that the metric degenerates at infinity in a definite way, 
instead of just flattening out.'!° This implied that no material particle or light 
ray could ever escape from the bulk of material bodies. Thus the question of its 
behaviour at infinity would not even arise. And the danger of the world 
eventually evaporating—which had always loomed large in Newtonian 
cosmology—would disappear. Einstein, however, was able to show, with the 
assistance of J. Grommer, that a spherically symmetric static metric degenerat- 
ing at infinity in the agreed way cannot be reconciled with the fact, certified 
beyond question by pre-1917 astronomical data, that the stars move quite 
slowly relative to each other (in comparison with the speed of light).?° To solve 
the ensuing difficulty Einstein proposed then one of his most memorable 
hypotheses, which marks the beginning of the wondrous development of 20th- 
century cosmology, namely, that physical space is finite, even though it has no 
bounds. 

There is no need to decide about the inertial behaviour ofa particle infinitely 
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distant from the rest if there is no room for them to come so far apart. We do 
not have to prescribe the conditions at infinity if the world is finite in size. Now 
the idea of a finite yet limitless space was implicit in the very notion of a 
Riemannian manifold, as Riemann himself had already noted in his inaugural 
lecture of 1854.7! The arbitrary Lorentz metrics countenanced by General 
Relativity might for instance be defined on a compact 4-manifold or at any rate 
ona 4-manifold decomposable into a family of compact 3-manifolds which the 
metric makes into spacelike hypersurfaces of finite volume.?? Einstein (1917) 
showed that the field equations of General Relativity, in their modified form 
(D.6) admit an exact solution defined on a 4-manifold diffeomorphic with 
R x S?,?> provided that the following additional assumptions are made: 


I. The distribution of matter is static and homogeneous. 
II. The constant 4 in equations (D.6) is different from 0. 


This was the first cosmological solution of the field equations and is known as 
Einstein’s spherical or static universe. In accordance with (I) the matter 
distribution can be thought of as a pressureless fluid (“dust”) uniformly 
smeared over all spacetime. Every tangent to the worldline of each in- 
finitesimal speck of matter can be regarded as the local value of a vector field, 
the worldvelocity field of the fluid. The spacetime of Einstein’s solution can be 
partitioned into compact spacelike hypersurfaces, orthogonal to the world- 
velocity field, and isometric with S? (up to a constant scale factor; see note 23). 
Thus, if H denotes any such spacelike section of the world and g is the 
restriction to H of the spacetime metric, (H, g) is a proper Riemannian 3- 
manifold with constant positive curvature.?* 

The double assumption that the distribution of matter is (a) static, and (b) 
homogeneous, is meant of course asa first approximation, true only of what we 
may call the “coarse structure” of matter—its “fine structure” being quite 
obviously variegated and changing. To justify assumption (a) Einstein again 
invokes the slow motion of the stars. On the other hand, assumption (5) is 
adopted on purely speculative grounds. Einstein puts it thus: “If we suppose 
that the world is spatially closed it is plausible to assume that the density of 
matter is independent of its position.”?° Extragalactic astronomy has sub- 
sequently confirmed that the universe is indeed homogeneous in the large, and 
delivered us from the illusion that it is in any sense static. Ironically, the 
intragalactic observations that were available to Einstein and persuaded him 
of the truth of (a) also clashed openly with (b), for—as Einstein once 
remarked?°—the stars of the Galaxy are not evenly distributed in the sky. 
Apparently Einstein was willing to extrapolate from the observations at hand 
only when they agreed with his own broad vision of what the world must be 
like. 

The story of assumption (II) is no less instructive from a methodological 
point of view. The 4-term in equations (D.6) was added to the Einstein tensor 


GRAVITATIONAL GEOMETRY 199 


for the sole purpose of obtaining a physically viable solution conformable to 
Mach’s Principle. This seemingly ad hoc change turned out to be very 
reasonable from a mathematical and philosophical point of view, for, as we 
have noted earlier, the modifed Einstein equations (D.6) are virtually the only 
admissible system of field equations for a geometrical theory of gravity 
patterned after Einstein’s “Outline” (see pp. 160 and 323, note 34). But there 
was not the slightest physical inducement for setting 4 # 0. Indeed, in order to 
preserve the agreement of the field equations with the observed facts Einstein 
had to assume that the “cosmological constant” 4 was so small as to be 
empirically indistinguishable from 0. (Observations of distant galaxies by 
Sandage (1961, 1968) place an upper bound on |A| of the order of 107 °° cm” 7.) 
Moreover, as D. J. Raine (1975), p. 515 pointedly observes, a non-zero 
cosmological constant would represent a physical field which acted on 
everything but was not in turn acted upon, just like the “ghostly” linear 
connection ofa flat spacetime rejected by the Machian philosophy. Worse still, 
the hypothesis that 1 # 0 is neither necessary nor sufficient to secure the 
achievement of Einstein’s purpose. It is not necessary, for—as Einstein himself 
remarked in his (1917), p. 152—the unmodified field equations (with 1 = 0)are 
also satisfied by a universe sliced into spacelike 3-spheres provided that the 
distribution of matter is not required to be almost static. (As we shall see in 
Section 6.3 this line of thought was successfully pursued by A. Friedmann in 
1922.) It is not sufficient, for, as W. de Sitter (1917a, b,c) was quick to show, the 
conditions (I) and (II) on page 198 yield besides the solution proposed by 
Einstein still another non-flat solution, consisting of a spacetime altogether 
devoid of matter.?’ To postulate an empty universe may seem the ne plus ultra 
of scientific extravagance, but it is no worse an approximation than Einstein’s 
plenum to what is actually seen through the telescope. In a sufficiently grand 
perspective the visible stars—and indeed the galaxies themselves—could be 
regarded as test particles exerting a negligible influence on the spacetime 
metric.7® Whatever its intrinsic merits, de Sitter’s solution demonstrated that, 
contrary to Einstein’s expectations, there is indeed “a singularity-free space- 
time continuum with everywhere vanishing energy tensor of matter” that 
satisfies the modified field equations.”° 

There are further indications that General Relativity, as embodied in the 
original or in the modified Einstein field equations, not only does not entail 
Mach’s Principle but is ultimately inimical to Mach’s idea. Let us note some of 
them.*° 


First several new non-flat matter-free solutions of the field equations have 
been discovered, which show that an empty universe built in accordance with 
General Relativity can be a very lively place, the interplay of pure gravitational 
energy being enough to produce some remarkable phenomena if a handful of 
test particles is supplied.*! 
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Secondly, it was shown by Carl H. Brans (1962) that the first Machian effect 
derived by Einstein from his field equations, viz. that the inertia of a particle is 
increased by piling up masses in its neighbourhood, can always be wiped out 
by a suitable choice of a coordinate system and does not even show up ina 
coordinate-free description. It is therefore a chart-dependent mirage, not a 
true physical effect. Brans’ result cannot take us by surprise, for, as we observed 
on page 319, note 6, in a relativistic spacetime it is always possible to define a 
chart along the worldline of any particle, relative to which all the gravitational 
field components I"}, vanish at every point of that worldline. In memory of its 
discoverer, such a chart is called a Fermi chart. When referred to it, the 
particle’s behaviour appears to be unaffected by the gravitational influence of 
the rest of matter, and its inertial tendency to persevere in or relapse into a 
geodetic path cannot be linked to that influence. In particular, if the particle is 
falling freely, a Fermi chart along its worldline is what we called earlier a local 
Lorentz chart (p. 136), Although the particle’s geodesic trajectory and the local 
Lorentz charts definable along it strictly depend on gravity, the Principle of 
Equivalence implies that physical phenomena referred to such charts obey the 
laws of Special Relativity and show no effects of the surrounding mass 
distribution. In this sense, as Steven Weinberg has ably observed, “the 
equivalence principle and Mach’s principle are in direct opposition”.° On the 
other hand, the remaining two Machian effects claimed by Einstein (p. 196) do 
indeed take place in General Relativity under suitable initial and boundary 
conditions. H. Thirring (1918, 1921) and J. Lense and H. Thirring (1918) 
showed, respectively, that a massive rotating hollow sphere generates a 
Coriolis and a centrifugal field in its interior, and that a massive rotating 
central body induces an acceleration on nearby test particles. Their results are 
indeed questionable from our point of view, for they used the linearized field 
equations (p. 325, note 35), thereby presupposing a flat background metric, in 
flagrant violation of the Machian idea.**> However, according to A. K. 
Raychaudhuri (1979, p. 241), A. Lansberg has proved—to first order in the 
angular velocity of the rotating shell—that Thirring’s effect also obtains in 
Einstein’s static universe. 


Thirdly, the conceptual system of General Relativity provides the means of 
giving a definition of rotation in the neighbourhood of a particle which is not 
tied to a frame of reference and is altogether independent of the distribution of 
matter. This is easily done if the particle describes a timelike geodesic y. Let e) 
be the worldvelocity of the particle ata worldpoint P. We can always find three 
spacelike vectors e,, 3, ¢@3 at P which form with eg an orthonormal tetrad (i.e. a 
basis of the tangent space at P such that gp(e;, e;) = 4;;, where g denotes as 
usual the spacetime metric). Let E, be a vector field, parallel along y, whose 
value at P is e,. We say that E,, E, and E, constitute a set of non-rotating axes 
ora “compass of inertia” along the geodesic y. It will be readily seen that at any 
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point of y the worldvelocity of the particle forms an orthonormal tetrad with 
the local values of the E,. A compass of inertia with similar properties can be 
defined along an arbitrary timelike curve by means of the concept of Fermi 
transport.** Let y be such a curve, parametrized by proper time, and let 
E = (E,, E;, E;) be a compass of inertia along it. Consider a field V of unit 
vectors whose value at each point P of y is orthogonal to the local tangent yp. 
We shall naturally say that V rotates with respect to E unless the three 
“direction cosines” g(V, E,) are constant along y. It will be noted that if V 
rotates with respect to E it also rotates with respect to any other compass of 
inertia E’ constructed along y according to our definition. The instantaneous 
rotation of V at each P in the range of y can be represented by a vector in the 
space spanned by E at P, which is of course spanned also by the local value of 
E’. (One may thus meaningfully and indeed correctly assert, within the 
framework of General Relativity, that Copernicus and not Ptolemy was right.) 
Let U denote the worldvelocity field of a fluid. The local rotation of each small 
volume of fluid with respect to the compass of inertia along the integral curves 
of U is measured by the fluid’s vorticity.*> The Einstein field equations admit 
exact solutions in which the cosmic distribution of matter has non-zero 
vorticity. In such cases it makes good sense to say that matter rotates in space. 
Nobody claims that we actually live in such a cosmic merry-go-round, but the 
fact that a rotating universe can be rigorously conceived as just another 
instance of the same mathematical structure of which our world is very nearly a 
model, should make every Machian shudder.*°® 


Soon after 1918 Einstein became convinced that General Relativity does not 
subordinate the metric field to the energy tensor of matter but brings them 
both into a dynamic interplay. In a lecture on “Ether and the Theory of 
Relativity”, delivered on May Sth, 1920, while still proclaiming the Machian 
inspiration of the theory, Einstein likened the spacetime continuum to an 
ether, i.e. a physical medium “which is itself devoid of all mechanical and 
kinematical qualities, but helps to determine mechanical (and electromagnetic) 
events”.°’ In turn, the metrical properties characteristic of this medium differ 
in the neighbourhood of its several points “and are partly conditioned by the 
matter existing outside the region under consideration”.** Although he was 
never happy about this duality of metric and matter, his successive attempts at 
a unified field theory sought to overcome it by deriving the properties of matter 
from the spacetime geometry, rather than the other way around. In the course 
of his reflections he grew increasingly wary of Mach’s brand of empiricism, 
whose strongly mentalistic undercurrent had never really got hold of him. In 
1937 he wrote for Carlton Berenda Weinberg a sort of balance sheet of Mach’s 
pros and cons. Mach had shattered the dogmatism of 19th century physics by 
exposing the hypothetical or logically arbitrary nature of its basic concepts and 
laws. He had rightly criticized Kant’s apriorism with regard to the categories 
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and some fundamental principles. But he had erroneously believed that 
conceptual knowledge grows “out of sense data without free creation of 
concepts”.>° Near the end of his life Einstein sent to Felix Pirani a detailed 
appraisal of Mach’s Principle. Mach “sought to abolish space and replace it by 
the relative mutual inertia of ponderable bodies”. “This certainly did not work, 
quite apart from the fact that time remained with its absolute character.” When 
people “speak today about Mach’s Principle (Machs Prinzip) they do not mean 
to abolish the continuum but to preserve the field. But they think that the field 
ought to be completely determined by matter. This however is a ticklish affair 
(eine heikle Sache) for the T,, which are to represent ‘matter’ always 
presuppose the g;,.”*° “In effect, T,, is the part of the total field of which we 
know that, for the time being, we can only superficially (phenomenologically) 
take its structure into consideration.” Einstein sums up his final position thus: 


In my opinion we ought not to speak about the Machian Principle (dem Mach’schen Prinzip) 
any mote. It proceeds from the time in which one thought that the “ponderable bodies” were 
the only physical reality and that all [concepts that could not be traced back to them] 
elements that could not be fully determined by them ought to be avoided in the theory. Iam 
well aware that for a long time I too was influenced by this fixed idea.*! 


Nevertheless, during the fifties and sixties Mach’s Principle persistently 
engaged the attention of scientists. Some of them acknowledged that it clashed 
with General Relativity and proposed new theories of gravity which pre- 
sumably satisfied it.4? Others altered its meaning and scope to make it usable 
in General Relativity.**:** Enthusiasm for Mach’s Principle has waned in the 
seventies, together with the ideological motivation that kept it alive. 
M. Reinhardt (1973) summarizes the conclusion of his excellent account of the 
subject in the statement that Mach’s Principle, though an extremely stimulat- 
ing thought, has at present little claim to be a basic physical principle. An 
interesting exception is D. J. Raine (1975) who, having characterized Mach’s 
Principle as the triple assertion that anly relative motion is observable, that 
inertial forces arise from a gravitational interaction of matter alone, and that 
the spacetime metric is totally dependent on the matter content of the universe, 
sets up mathematical criteria for selecting the global solutions of the Einstein 
field equations which satisfy that principle. Only such solutions should be 
considered physically viable according to Raine.** 


6.3 The Friedmann Worlds.! 


Relativistic cosmology arose, as we saw, from philosophical reflections 
concerning the relationship between spacetime and matter. Initially it only 
invoked basic, qualitative data, which had been available for a long time. 
Deftly using the conceptual resources of Riemannian geometry it gave, at its 
very outset, a highly satisfactory solution to the problem of the world’s 
extension in space. If, as Einstein proposed, the universe is finite yet boundless, 
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there is no point in asking with Archytas what will become of a missile thrown 
outwards from the farthermost star, nor—it seems—need one fear with Kepler 
that if stars exist everywhere with the same average density as in our 
neighbourhood, the firmament will outshine the sun.” Einstein’s cosmology 
contributed perhaps more than anything else to his fame among the educated 
classes right after the first World War. Weltering in novelty as no generation 
before them, the contemporaries of Pound and Joyce, Schonberg and Bartok, 
Matisse and Mondrian, greeted the man who had given a shape to the universe 
as the epitome of creative genius. And yet, mind-boggling as it seemed then, 
Einstein’s spherical world was a staid, almost philistine invention in com- 
parison to what came next. A new spate of astronomical data, following a great 
technical breakthrough, found in General Relativity the fitting framework for 
its interpretation. The unexpected agreement of theory and observation put 
new life and drama into the hackneyed idea of evolution and supported, for the 
first time in the history of rational inquiry, the ancient belief that the universe 
has existed only during a limited time, that no piece of matter, however 
elementary, has been around for more than so many years. 

Very sketchily, the story can be told as follows. In the early twenties, the 
brightest spiral nebulae were resolved into stars, including some of the 
Cepheid type. These are variable stars whose brightness oscillates with a 
period related to their intrinsic luminosity. Assuming that the period- 
luminosity relation established for nearby Cepheids by H. Leavitt also held for 
the newly discovered ones, E. Hubble estimated that the great Andromeda 
nebula (M 31 = NGC 224) lay at a distance of 900,000 lightyears and was 
therefore a star system not substantially smaller than the Galaxy. Assuming 
that the fainter nebulae were roughly of the same size as the brighter ones, he 
calculated distances up to eight times larger. This meant a colossal increase in 
the size of the visible world and the amount of matter contained in it.> V. M. 
Slipher had been studying for some time the spectra of nebulae—or galaxies, as 
we call them now—1in order to determine their radial velocities. It turned out 
that most of them exhibited significant redshifts, indicative of recession at 
surprising speeds. A list furnished by Slipher to Eddington in February 1922 
and printed in the 1923 edition of the latter’s Mathematical Theory of 
Relativity included 41 galaxies of which 36 were receding with speeds of up to 
1800 km/s.* Half this list, with four additions and five minor corrections, 
provided the sample on which E. Hubble (1929) based his momentous claim 
that the radial velocity of the receding galaxies is proportional to their distance 
from the Earth.° A couple of years later, M. L. Humason (1931) published a list 
of galaxies with receding velocities of almost 20,000 km/s. W. de Sitter (1917) 
had noted that his solution of the Einstein field equations entailed “a 
systematic displacement towards the red of the spectral lines of nebulae”.® At 
first, therefore, Slipher’s data were generally regarded as an indication that 
de Sitter’s spacetime, notwithstanding its emptiness, was a much better model 
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of the real world than Einstein’s.’ However, the Soviet mathematician 
A. Friedmann had shown as early as 1922 that the universes of Einstein and 
de Sitter were merely limiting cases of an infinite family of solutions of the field 
equations for a positive but varying matter density, any one of which would 
entail, at least for some time, a general recession—or a general oncoming, for 
the solutions are indifferent to time reversal—of the galaxies. Friedmann’s 
work was carried out in Leningrad during the Russian civil war, presumably in 
complete ignorance of Slipher’s measurements. His simplifying assumptions 
are much weaker than either Einstein’s or de Sitter’s and thus define a likelier 
idealization of the real world. Einstein’s theory of gravity predicts the 
expansion—or contraction—of any universe that roughly agrees with 
Friedmann’s assumptions. If the agreement is perfect and the cosmological 
constant J < 0 (cf. pp. 198 f.) the worldlines of matter—i.e. the integral curves 
of the worldvelocity field of the matter distribution-—are all focused on a 
mathematical point. If the universe is now expanding only a finite proper time 
can have elapsed along each of them since they parted their ways. (This is true 
also when A > 0, if certain other conditions are fulfilled.) 

The Friedmann solutions are hailed nowadays as “the most provocative and 
important cosmological model which has been devised since Bruno”.® 
However, when Friedmann published them in the German Zeitschrift fiir 
Physik they attracted little attention. Some potential readers may have been 
deterred by Einstein’s laconic remark, in the same journal, that Friedmann’s 
non-stationary universes are incompatible with the field equations.’ At any 
rate the Belgian G. Lemaitre had apparently never heard of the Friedmann 
solutions when he rederived them in 1927 with the explicit aim of “accounting 
for the radial velocity of extra-galactic nebulae”.'° Soon thereafter the 
American cosmologist H. P. Robertson (1929) set out to discover the most 
general metric for a relativistic spacetime in which no preferred spatial 
directions are discernible about any average worldline of matter, and came up 
with the same family of solutions that had been found by Friedmann and 
Lemaftre.!! That the world, on a cosmic scale, does indeed look the same to us 
in every direction is a generally accepted fact. Hence, if we profess the 
“Copernican” view that there is nothing special about our own vantage point, 
we must conclude that, as Robertson assumed, the universe is isotropic 
everywhere. This conclusion is confirmed almost beyond question by the 
nearly perfect isotropy of the microwave background radiation discovered by 
A. A. Penzias and R. W. Wilson (1965). What Robertson showed was that a 
universe shaped according to Einstein’s field equations, if it is thus roughly 
homogeneous and isotropic, must approximately model one of the Friedmann 
solutions. This cannot be the static one discovered by Einstein in 1917 for, as 
Eddington (1930) proved, it is essentially unstable and will expand to infinity 
or contract to a point at the slightest disturbance. Enlightened by Hubble’s 
discovery and probably shaken by Eddington’s proof, Einstein (1931) declared 
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his preference for the non-static Friedmann solutions with 4 = 0, which, he 
noted, enabled one “to do justice to the facts without introducing the 
theoretically anyway unsatisfactory 4-term”.'? In the Appendix added to the 
second edition of The Meaning of Relativity (1945), Einstein said: 


As Friedman was the first to show one can reconcile an everywhere finite density of matter 
with the original form of the equations of gravity if one admits the time variability of the 
metric distance of two mass points. The demand for spatial isotropy of the universe alone 
leads to Friedman’s form. It is therefore undoubtedly the general form, which fits the 
cosmological problem.'? 


Friedmann (1922) based his work on two kinds of assumptions, which 
concern (I) the gravitational field and the state and motion of matter, and 
(II) “the general, so to speak geometrical character of the world.” The 
assumptions of the first class are: 


(I.1) The gravitational field satisfies the modified Einstein equations 
(D.7), where 4 can take any value, including 0. 

(I.2) Matter can be represented as a pressureless fluid, at rest relatively to 
the coordinate system (x°, x!, x”, x3) that will be introduced below. 


(1.2) implies that the integral curves of 0/0x° agree, up to a reparametrization, 
with the worldlines of matter. 

With a minor amendment (see note 14), Friedmann’s assumptions of the 
second class can be stated as follows: 


(II.1a) Spacetime admits a global time coordinate function x°. 

(I1.1b) The restriction of the spacetime metric to any spacelike hyper- 
surface x° = const. constitutes on the latter a proper Riemannian 
metric with constant curvature K, which depends on the value of 
x° and is either more than, equal to, or less than 0 for every such 
hypersurface.** 

(11.2) The hypersurfaces x° = const. are everywhere orthogonal to the 
integral curves of 0/0x°. 


Assumption (II.1a) expresses the traditional belief that spacetime can be split 
into time and space in at least one way. Friedmann apparently found it so 
obvious that he did not bother to mention it. We now know that it holds good 
in a relativistic spacetime if and only if the latter is “stably causal”, in the sense 
defined in note 23 to Section 6.4 (see page 337). Assumption (II.1b) implies that 
each three-dimensional space of simultaneous worldpoints is homogeneous 
and isotropic as far as the geometry is concerned—i.e. that each point on it is 
equivalent to every other point and that there are no preferred directions about 
any of them. Friedmann may have adopted (II.1b) as a geometrical statement 
of the observed isotropy of the universe together with the “Copernican” thesis 
that all contemporary standpoints are interchangeable; or else as a statement 
of the then common opinion that physical (space) geometry places no 
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restrictions on rigid motion.’> Friedmann notes that “no physical or 
philosophical reasons can apparently be given for assumption [TI.2]. It serves 
only to simplify the calculations.”'® This is perhaps the weakest spot in 
Friedmann’s work, for (II.2) in conjunction with the assumptions of class | 
imposes severe restrictions on the distribution of matter, which are ultimately 
responsible for the peculiar properties of the Friedmann solutions, while 
assumptions (IJ.1) are compatible with a much wider variety of world- 
models.'’ Therefore (11.2) cannot be accepted merely because it is expedient, 
but only on solid rational and empirical grounds. Such grounds, however, were 
supplied by Robertson. 
Suppose that we take (II.La) for granted and, as Robertson puts it 


we demand that to any stationary observer (“test body”) in this idealized universe all (spectal) 
directions about him shall be fully equivalent in the sense that he shall be unable to 
distinguish between them by any intrinsic property of space-time, and he shall similarly be 
unable to detect any difference between his observations and those of any contemporary 
observer.'® 


It follows at once that spacetime must admit a group of motions (i.e. a 
connected Lie group of isometries) that map each hypersurface x° = const. 
onto itself, either by rotating it about an arbitrary point (a momentary 
“observer”) or by translating any such point to any other one. Since arbitrary 
rotations and translations in a 3-manifold involve 6 degrees of freedom, the 
group in question must be a 6-parameter Lie group. Robertson invokes a 
theorem by G. Fubini (1904) to the effect that a 6-parameter group of motions 
acting on a Riemannian 4-manifold acts transitively as a group of motions on 
each member of a family of hypersurfaces with the following properties: 


(F1) the hypersurfaces are orthogonal to a congruence of geodesics; 

(F2) each pair of hypersurfaces intercepts segments of equal length on 
every geodesic of the congruence; 

(F3) the mapping that sends each point P of the hypersurface H to the 
intersection of the geodesic through P with the hypersurface H’ is a 


conformal mapping of H onto H’.'° 


Obviously, Robertson’s demand entails Friedmann’s assumption (I].2), pro- 
vided that the integral curves of C/Cx® are geodesics. The requirement is a 
natural one in General Relativity if, as we stipulated, ¢/¢x° is everywhere 
tangent to the worldlines of matter.?° (Note that, by (F2), x° must then 
measure proper time, up to a constant scale factor.?!) Robertson’s demand 
also entails (II.1b).?? For in order to admit a 6-parameter group of motions, 
each 3-dimensional hypersurface x° = const. must be a Riemannian manifold 
of constant curvature;?? and (F3) implies that the curvatures corresponding to 
the several values of x° havea positive ratio to one another—unless they are all 
zero. 

Later, Robertson (1935) derived assumption (II.la) from the following 
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premises, which may be thought to supplement or merely to clarify his earlier 
demand regarding the equivalence of all standpoints and perspectives: (a) the 
“observers” which are claimed to be equivalent describe a congruence of 
worldlines (ie. each worldpoint lies on the worldline of one and only one of 
them); (b) two worldpoints can be joined at most by one “light-path” or 
worldline of a direct light signal. (c) the path of every light signal that an 
“observer” A can receive directly from an “observer” B lies on a 2-manifold 
(AB); (d) (AB) is identical with (BA). Robertson shows that if these four 
conditions are met all observers can synchronize their clocks and thus define a 
global time coordinate function.?* 

Conditions (II.1a), (F1), (F2) and (F3) entail that there exists a spacetime 
chart x—with the time coordinate function x°—relative to which the line 
element takes the form 


ds? = M?(dx°)? — Sg, ,dx%dx? (6.3.1) 
B 


where the g,, depend only on the space coordinates x*, while M and S are 
functions of x° alone.?> We can set M = 1 by equating x° with proper time 
along every worldline of matter (see note 21). It is then customary to write t for 
x°. S is the scale factor relating the conformal metrics of the several spacelike 
hypersurfaces t = const. Since S cannot be equal to 0 for any value of the 
coordinate function 1, it does not change signs, and we shall assume that it is 
always positive. (On the implications of the contrary assumption see note 31.) 
Due to the isotropy of the hypersurface t = const. about each point, the space 
coordinate functions x* can be so chosen that, if we write (r,6,@) for 
(x1, x7, x9), gi, = (l—kr?)"', Ga. = 1r?, 933 =r? sin? 0 and g,, = 0 whenever 
a # B; k being equal to 1, —1, or 0 if the curvature of the hypersurfaces 
T= const. is, respectively, positive, negative, or flat.2° Referred to these 
coordinates, the line element (6.3.1) becomes 


ds? = dt* —S? ( +r? (d0? + sin? 0 do?)) (6.3.2) 


dr 
1 —kr? 
We call (6.3.2) the Robertson-Walker line element. It clearly brings out the fact 
that in a spacetime compatible with Robertson’s requirements, the spacelike 
separation between simultaneous worldpoints along two given worldlines of 
matter may grow or diminish in the course of time as a consequence of the 
geometry alone (just like the distance between places of equal latitude along 
two given meridians increases from the poles to the Equator because such is the 
shape of the Earth.) The frequency shift of light sent from one such worldline to 
another can be directly obtained from (6.3.2) if it is assumed that light describes 
null geodesics. Suppose that a material particle A radiates light at time t = t, 
during a short time At, , which is received by a particle Bat time t = 1, during 
a short time At,. Since the number of lightwaves emitted is the same as the 
number received, the frequencies v; (i = 1, 2) at emission and reception satisfy 
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the relation v, At, = v,At,. We assume, of course, that on its way from A to B 
the light does not suffer any accidents that can alter its frequency, so that any 
difference between v, and v, is exclusively due to the shape of the metric field. 
The worldline of a light signal must satisfy the differential equation: 


dt* —S?do? =0 (6.3.3) 


where I have written do? for 9,,dx*dx*. The coordinate distance o from A to B 
can be calculated by integrating (6.3.3): 


o= | S~'dt (6.3.4) 
Since a is constant it is also given by 
T + At, 
o= | S~'dr (6.3.5) 
1+Ay 


If we now subtract (6.3.5) from (6.3.4) and ignore the small change in S during 
the short periods of time At, we find that: 
vy, Ay Sy 
Wp Ate S83 
where S, stands for the value of the scale factor S at time t;. The frequency shift 
z, conventionally defined as the ratio of the frequency loss to the observed 
frequency, is therefore given by: 


(6.3.6) 


Pees a oa (6.3.7) 
V2 S; 


zis positive—a “redshift”—if the scale factor S is larger at t, than it was at t,— 
i.e. if the universe has “expanded” in the mean time.”’ 

From the metric components g;; implicit in (6.3.2) we obtain, by straightfor- 
ward application of formulae (D.1), (D.2) and (D.4) on p. 281, the Ricci 
components - 

Roo = 35/S Ry, = 90 (6.3.8a) 
Rag = Rag — (SS + 287) Gu 5 (6.3.8b) 


where I have written S and $ for dS/dt and d?S/dzt’, respectively, and the Ry, 
are the Ricci components determined by the “spatial” metric do.?® It can be 
shown that R,, = — 2kG,,,"? so that 


Rag = — (SS +28? + 2k)Gap (6.3.8b') 


By assumption (I.2) the components of the energy tensor relative to the chosen 
chart are Tyg = p,and T;; = O unless i = j = 0. Since the metric is linked to the 
distribution of matter and the hypersurfaces t = const. are homogeneous and 
isotropic, the energy density p must depend on 1 only.*° 
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Substituting the above values for g;;, R,; and T;; into the Einstein field 
equations (D.6), we obtain the following—omitting the equations in which 
both sides turn out to be identically 0—: 


= —A= —4xp (6.3.9a) 
S823 QR 
Sf ee Pep he aKP (6.3.9b) 
These are the gravitational field equations for the Friedmann worlds. (6.3.9a) is 
equivalent to ae 
= -s(‘2-3) (6.3.10) 


Consequently if in a given Friedmann world (W, g), 4 < 0, S must be negative. 
In that case, S cannot have a minimum and must tend to 0 as t approaches 
some finite value ts. (The B may stand for “Beginning” or “Big Bang”.) 
Specifically, if Sy denotes the value of S at time t = 0, 


Ta= S-'ds (6.3.11) 
So 
and will be negative (“to the past of time 0”) if the “expansion rate” 
Sy = dS/dt|, > 0.3! As I remarked in passing on page 204, a similar situation 
occurs under certain conditions even if 1 > 0; for instance, if k < 0.°? 

Since the line element (6.3.2) collapses if S = 0, it is clear that the domain of 
the chosen‘chart cannot contain a worldpoint P such that t(P) = tg. This in 
itself is not surprising, for a spacetime chart need not and normally will not be 
global. Thus, for example, a quick glance at (6.3.2) will persuade us that, if 
k = 1, the domain of the chosen chart cannot contain a worldpoint P such that 
r(P) = 1. Nevertheless, a point sequence defined along a given parametric line 
of r by the rule P, = r~'(1 —2~") will converge, as n goes to oo, to a definite 
worldpoint P,, lying in the domain of another chart.?? However, the situation 
that arises as t approaches tz is not just a consequence of our choice of a 
particular chart, for, as we shall now see, the curvature scalar Ri—which is, of 
course, chart-independent—increases beyond all bounds as t goes to Tg. 
Multiplying both sides of the field equations (D.6) by g‘/ and summing over j 
we verify that Ri = xT + 42. Solving the system (6.3.9) for T} = p we find that 


3/8? kG 
a (ener ene 3.12 
p *(Rtp 4 (6.3.12) 


Multiplying both sides of (6.3.12) by S? and differentiating with respect to t, we 


oa d 3882/28 $2 k 
— (pS?) = : - 3. 
4, eS) = — (Frat ye i) (6.3.13) 
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Now, the right-hand side of (6.3.13) is equal to 0, as can be readily seen by 
adding (6.3.9a) to (6.3.9b). Hence, pS? is a constant and the energy density p 
must increase without bounds as S approaches 0. Since x and / are constants, 
the same can be said of the curvature scalar Ri = xp+4/. Thus in our 
Friedmann world (W, g) the sequence of worldpoints P, = t~'(tg+2~"), 
chosen along a worldline of matter, does not converge to a point P,, as n goes 
to oo, for if such a point existed, the curvature scalar would not be defined at 
it.>+ Hence, if one has a liking for rhetoric, one may well say that (W, g) came 
out of nothing at time t = tgand that |7,| is, at t = 0, the time elapsed “from 
the Creation of the World”.*> But one ought to bear in mind that in the chosen 
chronology tz 1s not really a time at all, but only the greatest lower bound of 
the range of values of the time coordinate, and that at any assignable time the 
Friedmann world (W, g) already has a past. 


6.4 Singularities 


The situation discussed at the end of the preceding section is commonly 
described by saying that a Friedmann world of the type in question has a 
singularity at t = tg. This manner of speech derives from the classical field 
theories of gravity and electricity, where the functions that characterize the 
field often do “have singularities” in a sense akin to that of function theory.’ 
Take for example the Newtonian gravitational field of a single massive particle. 
The field strength at any given point P is inversely proportional to the squared 
distance r? from P to the particle and therefore increases beyond all bounds as 
r tends to 0 and is undefined at the point where the particle is located. This, of 
course, does not in the least compromise the existence of that point, which is 
firmly held in place by its geometrical relations to the rest (cf. the quotation 
from Newton on p. 285, note 7), and can be aptly described as a singular point 
of the field. In General Relativity, however, one cannot meaningfully say that 
the gravitational field is undefined or “becomes singular” at a worldpoint, 
because the field itself is responsible for the geometry and is therefore 
constitutive of spacetime. Obviously a point P can only belong to the structure 
(W, g) if the metric g is defined at P. Thus, the Friedmann world on page 209 
can only improperly be said to “have a singular point” where the time 
coordinate t of (6.3.2) takes the value tg defined by (6.3.11). For, as I have 
already noted, under the stipulated conditions, that world can contain no 
points at which t = tg. The impropriety is indeed no worse than other abuses 
of language endemic in mathematical literature, and I would not have cared to 
mention it were it not for the confusions it has caused in philosophy, and even 
in ordinary conversation, with regard to the so-called “beginning” of such 
Friedmann worlds. However, as things stand, one ought to be meticulously 
precise on this matter, at least for some time, until the confusions are dispelled. 

If the so-called singularities of the Friedmann worlds and other solutions? 
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to the Einstein field equations cannot be simply understood on the analogy of 
the singular points of the classical field theories and the theory of functions, it 
is all the more necessary to have an appropriate concept with which to grasp 
them. Several such concepts have been proposed since the sixties, when the 
subject became the focus of active and successful research, and although none 
of them is altogether satisfactory they are apt to be very illuminating, as I hope 
to show. All these concepts rest on the perception that a spacetime in a way 
lacks something there where one would naively say that it has a singular point. 
They aim at defining the nature of this lack and specifying its relation to the 
points that spacetime actually does have. Now in physics we want to 
understand nature, as far as possible, purely in terms of itself. We may not 
therefore go beyond spacetime and its content in order to conceive its 
privations. It may sound as if this would make our task impossible, but 
mathematical thought has already proved its mettle by defining and even over- 
coming the incompleteness of certain structures without resorting to anything 
outside them. The standard example of this maneuver is the completion of 
incomplete metric spaces. Let (M, d) be a metric space, in the sense defined on 
page 303, note 1. A sequence P,, P,,... of points of M is said to be a Cauchy 
sequence if, for every positive real number 6, no matter how small, there is an 
integer N such that 6 exceeds the distance between any two points which occur 
in the sequence after Py. (d(P,, P,) <6 whenever N <r,s). The points of a 
Cauchy sequence (P,,) are therefore packed closer and closer as n increases, and 
one would expect them to pile up about a definite point Pe M. This, however, 
is not always the case. (M, d) is said to be metrically incomplete unless every 
Cauchy sequence of points of M converges to a point in M. Cauchy sequences 
do not only enable us to spot the gaps in (M, d) without having to look beyond 
it, but also provide the means for filling those gaps, so to speak, out of (M, d) 
itself. We define the distance between two Cauchy sequences (P,) and (Q,) as 
the limit, in R, of d(P,, Q,). Two Cauchy sequences are equivalent if their 
distance is equal to 0. Let M be the collection of all equivalence classes of 
Cauchy sequences of (M, d). Let dassign to any two “points” of M the distance 
between two representative elements. (M, d) is then a metric space. We embed 
M into M by equating each Pe M to the class of Cauchy sequences converging 
to P. d evidently agrees with don M c M. It is not hard to see that (M, d) is 
metrically complete. We call it the Cauchy completion of (M, d). Clearly it is 
made of ingredients drawn from the latter.? 

This example is not only philosophically interesting by itself, but also 
directly relevant to our problem, as we shall now see. Let M be an n-manifold 
with some additional structure and let y:! + M be a curve of some specified 
kind (e.g. a causal curve, if M is a causal space, etc.). y is said to be an 
inextensible curve of its kind unless there is a curve y’: I’ + M of the same kind 
such that I’ # I,1 c I'andyis the restriction of y’ to [. y isan incomplete curve 
of its kind if it is inextensible and / is a finite interval. Let M have a linear 
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connection [. We say that [ is incomplete and that (M,T) is geodesically 
incomplete if there are incomplete geodesics in M. If M isa proper Riemannian 
manifold it is also a metric space, with the distance function defined on page 
95. It can be shown that M is then metrically incomplete, in the sense 
discussed above, if and only if it is geodesically incomplete (with respect to the 
Levi-Civita connection).* The strict equivalence of metric and geodesic 
incompleteness in the proper Riemannian case suggests that we treat the latter 
as the natural generalization of the former to improper Riemannian 
manifolds—such as relativistic spacetimes—to which, for lack of a distance 
function, the metrical concept does not apply.* This suggestion inspired the 
first fruitful approach to the question of singularities in General Relativity, 
leading to the remarkable “singularity theorems” of Roger Penrose and 
Stephen W. Hawking. This approach, which lies at the root of all that have 
followed, is based on the idea that a singularity-free spacetime is one that is 
geodesically complete, or, more exactly, one in which all inextensible timelike 
and null geodesics are complete, i.e. defined on the entire field of real numbers.° 
The italicized qualification is due to the fact that not all spacetime geodesics are 
equally meaningful from a physical point of view. Thus, while the presence of 
an incomplete timelike geodesic implies that there might be “freely moving 
observers or particles whose histories did not exist after (or before) a finite 
interval of proper time”,’ the actual significance of incomplete spacelike 
geodesics is less clear. The said approach should have sufficed to dispel the air 
of paradox surrounding the notion of a spacetime singularity since the early 
days of General Relativity,® for evidently there is nothing impossible or even 
strange about a geodesically incomplete spacetime. After all, barring meta- 
physical insight or divine revelation, I do not see how we could ever learn that 
spacetime is geodesically complete—and all the more reputable sources of 
supernatural information decidedly point to the opposite conclusion. Indeed 
the venerable vision of man as a microcosm, and hence of the universe as Man 
Writ Large, can only be upheld within the framework of General Relativity if 
spacetime contains incomplete timelike geodesics along which, so to speak, 
time runs out. 

In order to see more clearly how the incompleteness of null or timelike 
geodesics can account at least in part for the situations traditionally described 
as spacetime singularities, let us again consider the Friedmann world (¥#,, g) 
discussed on pages 209 f. Take a finite open interval J such that tgeJ <R. The 
spacetime region t'(J) < W is then an open neighbourhood of each of its 
points, homeomorphic to R*.° Every worldline of matter contained in t~1(J) 
is a timelike geodesic whose maximal extension within the region—when 
parametrized by the proper time t—is defined on the intersection of J with the 
set {u|tg <u} CR. Hence t '(J) is by itself a geodesically incomplete 4- 
manifold. It is contained, of course, in a vaster spacetime region, into which the 
worldlines of matter are further extended, but only “towards the future”, i.e. in 
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the direction of increasing t. But no inextensible matter worldline of (W; g)can 
be defined on all R. The real number tg, or whichever takes its place in a 
reparametrization, is necessarily excluded from the domain of definition of all 
such worldlines.'° And there is no way of isometrically embedding (W,, g) into 
a vaster spacetime in which its incomplete timelike geodesics would be 
completed. Thus the incompleteness of t~'(J) is not remedied by extending 
this region to the full Friedmann world (W;, g), for the latter is constitutively 
and incorrigibly geodesically incomplete. As I said before, this is something 
that a Riemannian manifold can jolly well be. 

The region t~'(J) in which, so to speak, the incomplete geodesics of our 
Friedmann world are born is one where, if J is very small, the average energy 
density will be so great that matter cannot subsist there in any known guise. 
This suggests that the laws of physics cannot hold in that region in their 
familiar form. In particular, the Einstein field equations, which have never 
been tested under such extreme conditions, may stand in need of correction. 
The same considerations apply to all spacetimes in which geodesics run into 
regions where the curvature scalar increases indefinitely (see note 2) and entitle 
us to question the reality of geodesic incompleteness in all such cases.'! 
However, if physical theories were to be countenanced only within the range of 
circumstances in which they have been actually tested, the scope of theoretical 
physics would become so narrow as to stifle all its chances of growth. 
Cosmology, like other branches of natural science, is speculative and 
conjectural, and its conclusions are hypothetical, i.e. true if the postulated 
conditions hold good. However, to disqualify the latter while they still agree 
with all available facts of observation merely because their consequences run 
contrary to our preferences or prejudices is a gross violation of the rules of the 
game. Of course, if a particular solution of the Einstein field equations 
turns out to be geodesically incomplete this will be due to the initial and 
boundary conditions on which that solution depends, rather than to the field 
equations themselves, which also admit geodesically complete solutions. Since 
the conditions that have to be assumed in order to integrate the equations are 
uncommonly simple and neat, one might reasonably expect that under the 
rougher circumstances of real life the predicted incompleteness would not 
arise.!? In the case in point, however, such expectations suffered a rude blow 
when A. K. Raychaudhuri (1955) proved that adding density perturbations 
and anisotropic expansion (shear) to a Friedmann world cannot prevent the 
occurrence of incomplete timelike geodesics. They were finally quashed by the 
theorems of Penrose and Hawking. 

The singularity theorems assert that a relativistic spacetime is always 
geodesically incomplete when some very general and fairly realistic conditions 
are fulfilled. To state these conditions concisely I must first give some 
definitions. Let (W; g) be a relativistic spacetime satisfying Einstein’s field 
equations (D.7). We shall say that (W, g) is generic if each timelike or null 
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geodesic y goes at least through one worldpoint P at which the spacetime 
curvature induces expansion or contraction or shear on each local congruence 
of geodesics that includes y. (By “local congruence” I mean a congruence in the 
open submanifold constituted by a small neighbourhood of P.)!* This means 
that every geodesic worldline of matter or light includes at least one event at 
which gravitational curvature is, so to speak, effective. Every plausible, not 
inordinately symmetric spacetime is generic in this sense, and even one which is 
not so, becomes generic if the metric is slightly perturbed. We shall say that 
(W, g) meets the weak convergence condition if the Ricci tensor everywhere 
satisfies the inequality R,;V'V’ < 0 for every null vector V. (W, g) meets the 
strong convergence condition if R,,;V'V’ < 0 for every timelike vector V. The 
convergence conditions imply that matter—in whose presence only does R;; 
differ from O—has a focusing (or rather, a non-unfocusing) effect on 
congruences of null or, respectively, timelike geodesics. By continuity, the 
strong condition entails the weak one. They may be regarded as a formal 
statement of the fact that matter attracts nearby matter and radiation.!* A 
trapped surface in (W, g) is a compact spacelike smooth 2-surface S such that 
both the incoming and outgoing future-directed null geodesics that meet S 
orthogonally converge locally at S. (A trapped surface could be, for instance, 
the boundary ofa gravitational “sink”, a spatial region which is the seat of such 
a strong gravitational pull that all radiation originating inside it is sucked back 
into it.) 
I shall now state without proof the main singularity theorems:'* 


I. The relativistic spacetime (W, g) contains incomplete null geodesics if 
it (i) satisfies the weak convergence condition and contains (ii) a 
noncompact Cauchy surface and (iii) a trapped surface. (Penrose 
(1965).) 

II. (W; g) contains incomplete timelike geodesics if it (i) satisfies the 
strong convergence condition and (ii) contains a compact spacelike 
hypersurface H without edge, such that (ii) the unit vectors 
orthogonal to H are everywhere converging (or everywhere diverg- 
ing) on H. (Hawking (1967), Theorem I.) 

Ill. (W, g) contains incomplete null and timelike geodesics if it (i) satisfies 
the strong convergence condition, (il) is strongly causal (in the sense 
defined on page 125), and (iii) there exists a worldpoint PEW such 
that all future-directed (or all past-directed) timelike geodesics 
through P are focused by curvature and start reconverging within a 
compact region in the future (respectively, the past) of P. (Hawking 
(1967), Theorem IT.) 

IV. (W, g) contains incomplete null and timelike geodesics if it (i) satisfies 
the strong convergence condition, (ii) is generic, (iii) does not have 
any closed timelike curves and (iv) contains at least one of the 
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following: (a) a compact spacelike hypersurface without edge, (b) a 
trapped surface, or (c) a worldpoint P such that all future-directed 
(or all past-directed) null geodesics through P are focused by 
curvature and start reconverging somewhere in the future (respect- 
ively, the past) of P. (Hawking and Penrose (1970), Corollary.)'® 


These theorems imply that among relativistic spacetimes geodesic incomplete- 
ness is at least as likely as geodesic completeness, in the following sense: If the 
set of Lorentz metrics on any given spacetime manifold is endowed with a 
reasonable topology, the metrics involving geodesic incompleteness (of the 
physically significant kind) constitute an open set.!'7 

If we examine the hypotheses on which the singularity theorems depend we 
see that some of them, like the presence of a trapped surface, concern local 
properties of spacetime, which might be gathered from observations. Others, 
such as the convergence conditions, though supposed to hold everywhere, can 
be extrapolated from local circumstances. But the theorems also postulate 
hypotheses ofa strictly global nature, such as the existence of a Cauchy surface, 
or the absence of closed timelike curves. Although these hypotheses tradition- 
ally have been taken for granted in physics, it does not seem possible to 
ascertain their likelihood by the standard methods of statistical inference. 
They cannot, however, be dismissed as irrelevant, for their denial runs counter 
to the same deeply ingrained physical intuitions or beliefs that apparently clash 
with geodesic incompleteness. Thus the total absence of Cauchy surfaces from 
the world is no less fatal to (global) scientific prediction than the presence of 
singularities. And closed timelike curves, along which the past is reached again 
and again via the future, so that any event on them can act on itself as its own 
remote cause, are probably harder to reconcile with received views than 
incomplete causal geodesics running as it were into or out of nowhere. (It is 
worth noting that recent work by Tipler (1976, 1977) suggests that closed 
timelike curves rather than preventing the occurrence of singularities would 
contribute to produce them.) 

The interpretation of spacetime singularities in terms of geodesic incom- 
pleteness, though decisive for establishing their comparative abundance in the 
set of solutions to the Einstein field equations, is now thought to be both too 
wide and too narrow. Some find it too wide because, as Misner (1963) has 
shown, there are compact Riemannian manifolds which are geodesically 
incomplete and it is felt that a compact manifold, which as such “can be 
covered by a finite number of well-behaved coordinate patches” (i.e. chart 
domains), must be regarded as singularity-free.’® A different interpretation has 
therefore been proposed, according to which a spacetime is singularity-free if 
and only if every geodesic defined on a finite interval maps it into a compact 
subset of the spacetime manifold. This criterion is not exempt from difficulty, 
however. Since a singularity-free world in this sense can be nevertheless 
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geodesically incomplete it may contain a test particle that emerges from or 
vanishes into nothingness—a phenomenon we have hitherto linked to the 
occurrence of a spacetime singularity.'? Anyway, the problem which the new 
criterion seeks to avoid is purely academic, for compact spacetimes always 
contain closed timelike curves and may therefore be dismissed as physically 
implausible—at any rate until there is some evidence that we happen to live in 
one. On the other hand, the concept of geodesic incompleteness is certainly too 
narrow for our purpose, since it fails to cover some spacetimes in which a 
worldline of matter is liable to run out of time. Thus R. Geroch (1968a) 
proposed an example of a geodesically complete relativistic spacetime 
containing an inextendible timelike curve of bounded acceleration and finite 
total length.2° Such a curve could be the worldline of an accelerated test 
particle that vanishes without trace in finite proper time. This difficulty has 
been duly taken care of by the more recent conceptions of the spacetime 
singularities to which I shall refer below. 

The study of singularities cannot aim only at showing that they must occur 
in this or that type of spacetime. It must also seek to clarify their relationship to 
each particular spacetime region. We would like to know how those regions 
are diversely affected by the existence of singularities, when particles are likely 
to vanish “into thin air”, where curvature increases beyond all bounds, etc. In 
other words, we wish to “localize” the singularities with respect to the ordinary 
worldpoints. The standard way of achieving this is to represent the former by 
ideal points, defined in terms of the spacetime structure, which are then 
gathered together with the “real” worldpoints in a conveniently structured 
manifold-with-boundary (cf. page 325, note 32). The first proposal of this kind 
was independently put forward by S. W. Hawking, in his unpublished Adams 
prize essay, “Singularities and the geometry of space-time”, of 1966, and by 
R. Geroch (19685). The following sketch of its main idea is paraphrased from 
Geroch. Each point P in the manifold Wand each vector v in the tangent space 
W, determine a unique geodesic with initial condition (P, v) (see p. 278). By 
slightly varying P and v we obtain a family of geodesics filling a world-tube. We 
call this tube a thickening of the geodesic (P, v). Two geodesics are equivalent if 
each of them enters into and remains within every thickening of the other. An 
ideal point is an equivalence class of incomplete geodesics. Though this idea 
sounds simple and natural, its rigorous definition involves some ambiguities 
which can only be resolved by more or less arbitrary decisions. Moreover, it 
rests squarely on the concept of geodesic incompleteness and may fail to 
provide ideal points corresponding to non-geodesic inextensible curves of 
finite length, such as we mentioned above. For these reasons, Hawking 
considered that his representation of spacetime singularities by equivalence 
classes of incomplete geodesics had been superseded by B. G. Schmidt (1971), 
whose elegant construction does not suffer from the said shortcomings.”' 

B. G. Schmidt’s method can be readily described by means of the concepts 
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explained in Parts B and C of the Appendix. Let FW denote the principal 
bundle of tetrads over the relativistic spacetime (W, g). On page 272 we show 
how to define 16 real-valued 1-forms w',on F W determined by the relevant 
Levi-Civita connection form w. Let 8 be the canonical form of FW (p. 272) and 
set 6! = 7;°6, where 7; is the i-th projection function on R*. We endow FW 
with a proper Riemannian metric h such that, for any pair of vector fields X and 
Y on FW, 


h(X, Y) = 0'(X)0'(Y) + wi(X)wi(Y) (6.4.1) 


Let FW denote the Cauchy completion of (FW, h). It is not hard to show that 
FW = FW and (FW,h) is metrically complete only if (W, g) is geodesically 
complete. The action of the structure group GL(4, R) on FW can be readily 
extended to an action on FW. Let W denote the set of orbits of the latter action. 
FW can be naturally regarded as a principal fibre bundle over W. Some of the 
points of Ware the orbits of the action of GL(4, R)on FW and can therefore be 
identified with the worldpoints of W. The remaining points of W are the ideal 
points representing the singularities of W. We call W the Schmidt completion 
of (W, g). From this point of view, (W, g) is singularity-free if and only if 
W —W = o. Since W hasa natural topology as a quotient space of FW the ideal 
points are automatically located with respect to the ordinary worldpoints.”? 
Although Schmidt’s treatment of spacetime singularities is wonderfully simple 
in its mathematical conception it is extremely difficult to work with it in 
specific cases. When the Schmidt completion of the expanding and recollaps- 
ing Friedmann world described in note 9 was finally computed by B. Bosshardt 
(1976), it turned out that the initial and the final singularity were represented 
by the same ideal point. Moreover, R. Johnson (1977) showed that the Schmidt 
completion of the Schwarzschild and the expanding and recollapsing 
Friedmann solutions includes an ideal point whose only open neighbourhood 
is the whole completed spacetime (so that W, in these cases, is not Hausdorff 
space). These findings have understandably cooled the enthusiasm for 
Schmidt’s invention. 

From a philosophical standpoint the most interesting method of adding 
ideal points to a relativistic spacetime is that proposed by Geroch, Kronheimer 
and Penrose (1972), for it rests exclusively on the spacetime’s causal structure. 
Though it can be applied to a wide range of (abstract) causal spaces, we shall 
concentrate, for brevity’s sake, on the case of a typical relativistic spacetime 
(W, g). Until the end of this Section, we shall mean by the past (future) of a 
point PeW the chronological past /~ (P) (respectively, the chronological 
future [* (P)) defined in Section 4.6. We must assume that (W,, g) is both past 
and future distinguishing, i.e. that for any two distinct points P, Qew, 
I~(P) # 17 (Q) and I*(P) # I*(Q). (Each worldpoint’s past and future are 
exclusively its own—a condition that, as the reader will recall, is not met by 
Newtonian spacetime.)?> A subset A < Wis said to be a past set if it is identical 
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with its own past, i.e. if A = 1 (A). (The past of a set was defined as the union 
of the pasts of its points.) A non-empty past set is said to be indecomposable, or 
an IP, if it cannot be represented as the union of two past sets distinct from 
itself. All IPs fall into one of two categories, namely, the point IPs or PIPs, 
which are identical with the past of some point PeW, and the terminal IPs or 
TIPs, which do not have this property. It is not hard to see that the past of each 
Pew isa PIP. Since ¥ is past-distinguishing it can therefore be identified with 
the set of its PIPs. We agree to view the TIPs as ideal points. (The union of W 
and its TIPs we denote by W.) The rationale of this move will now be made 
apparent. The past /~ (y)ofacurve y in Wis the union of the pasts of all points 
in the range of y. It can be proved that a subset of Wis an IP if and only if it is 
the past of a timelike curve.?* Let us say that Pe Wis the future endpoint of a 
timelike curve y if it is the supremum of the range of y in the partial order 
induced in ¥ by chronological precedence; i.e. if P chronologically follows 
every point of y (other than itself) and precedes every other point with this 
property. Ifa timelike curve y does not have a future endpoint we say that it is 
future endless. Evidently, an IP is a PIP if and only if it is the past of a 
timelike curve y with a future endpoint P, in which case I (y) = I (P). On the 
other hand, if an IP is the past of a future endless timelike curve y whose 
domain, when parametrized by proper time, extends to + 00, it is, as we shall 
say,a oo-TIP. A TIP that is not a oc-TIP is said to be singular. Obviously, any 
timelike curve y such that I” (y) is a singular TIP is future endless and yet 
attains only a finite length to the future of any given point y(t). The definitions 
of an indecomposable future set or IF, a PIF, a oc-TIF and a singular TIF 
proceed along similar lines and are left to the reader. The argument in note 24 
can be easily adapted to prove that the IFs are precisely the futures of the 
timelike curves of W. Any timelike curve y such that /*(y) is a singular TIF is 
past endless and yet has no more than a finite length to the past of any given 
point y(t). # can be identified with the set of its PIFs, while the TIFs are 
regarded as ideal points. The union of Wand its TIFs is denoted by W. Clearly 
WW OW. As each PIP is identified with a Pe¥W, it is also identified with a 
PIF (and each PIF witha PIP). Generally one would also wish to identify some 
TIPs with some TIFs. After the appropriate identifications are made, the union 
of # and W constitutes the completion W of W. The definition of a topology on 
W involves some niceties that we cannot discuss here. Penrose has shown thata 
relativistic spacetime that satisfies the hypotheses of Theorem IV on page 214 
contains a singular TIF or a singular TIP. 

A successful representation of the spacetime singularities by ideal points 
enmeshed with spacetime in a neighbourhood system entitles us to situate 
them ina definite place and time. We should however always bear in mind that 
such ideal points are not worldpoints, i.e. sites for possible events, but are 
conceived at an even higher level of abstraction. It ought to be clear also that in 
any of the above interpretations, the spacetime singularities may but need not 
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be associated to a wild behaviour of the curvature or the energy tensor in any 
particular region. Indeed a singularity can be compatible with the physical 
integrity of matter along every worldline leading to it. Recent research has 
sought to distinguish different types of singularities, wild and tame, and to 
clarify the circumstances of their occurrence.?> Among the fiercest are the 
“crushing singularities” defined by Eardley and Smarr (1979). In a spacetime 
endowed with a compact Cauchy surface, and in which all inextensible timelike 
geodesics are past (or respectively future) incomplete, the past (future) 
singularity is said to be crushing if there is a sequence (S,) of spacelike Cauchy 
surfaces whose mean curvature k, satisfies the relation 

max k,(P)— — 00 

PeS, 
as the S, approach the singularity. The expanding and recontracting 
Friedmann world described in note 9 subsists between two crushing 
singularities. 


CHAPTER 7 


Disputed Questions 


7.1. The Concept of Simultaneity 


In the Aristotelian philosophy that still shapes much of our common sense, 
the present time, called “the now” (to nun), separates and connects what has 
been (to parelthon) and what is yet to be (to mellon).' Though the now, as the 
link of time (sunekhei khronou), the bridge and boundary between past and 
future, is always the same formally, materially it is ever different (heteron kai 
heteron), for the states and occurrences on either side of it are continually 
changing. Indeed, the so-called flow or flight of time is nothing but the 
ceaseless transit of events across the now, from the future to the past. If an 
event takes some time, then, while it happens, the now so to speak cuts through 
it, dividing that part of it which is already gone from that which is still to come. 
Two events which are thus cleaved by (materially) the same now are said to be 
simultaneous. Simultaneity, defined in this way, is evidently reflexive and 
symmetric, but it is not transitive. For a somewhat lengthy event—e.g. the 
French Revolution—can be simultaneous with two shorter ones—such as 
Faraday’s birth in 1791 and Lavoisier’s death in 1794—although the latter are 
not simultaneous with each other. However, if we conceive simultaneity as a 
relation between (idealized) durationless events we automatically ensure that it 
is transitive and hence an equivalence. Thus conceived, simultaneity partitions 
the universe of such events into equivalence classes, and time’s flow can be 
readily thought of as the march of those classes in well-aligned squadrons past 
the now. The succession of events on a given object—a clock—linearly orders 
the classes to which those events belong. The linearly ordered quotient of the 
universe of events by simultaneity is what we mean by physical time. 

We saw in Chapter | how Newton’s idea of time is rooted in such ultimately 
Aristotelian notions, and in Chapter 3 we considered Einstein’s critical analysis 
of them in his first paper on Relativity. But we agreed to postpone the study of 
the philosophical problem raised by Einstein’s criticism until the present 
chapter. The question at issue can be roughly stated as follows: In the light of 
Einstein’s work, is simultaneity a natural, though frame-dependent relation, or 
is ita man-made relation, generated by our conventional method of labelling 
events with time coordinates? The first alternative sounds indeed paradoxical, 
for reference frames are human devices for the description of physical 
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processes, so that whatever depends on them is in a sense artificial. Frame- 
dependent simultaneity, which in effect amounts to a ternary relation between 
two events and a frame, must involve the conventional stipulations that define 
the latter. On the other hand, as we know, some types of frames and of 
coordinate systems adapted to them are better suited than others for the 
description of events set in certain spacetime structures. For example, the 
“comoving” coordinates used in equation (6.3.2) are especially adequate to a 
Friedmann universe, while Lorentz charts are naturally congenial to 
Minkowski spacetime. Since the physical world admits a Lorentz chart if and 
only if spacetime is Minkowskian, the mere possibility of defining such a chart 
would imply that the frame-dependent simultaneity defined by it is somehow 
natural, and rests on solid physical grounds. (It may be argued, of course, thata 
spacetime structure is itself a human fabrication, but that is another matter, 
that we shall consider in Section 7.2.) 

Philosophical discussions of simultaneity have been greatly influenced by 
Kant. According to the Critique of Pure Reason (1781), the meaning of 
temporal relations is directly grasped through our intuitive awareness of our 
own mental states, which are necessarily displayed in time. However, the 
objective time order of physical phenomena cannot be simply read off the 
subjective time order of sense appearances but must be constructed by 
subsuming the latter and the phenomenal objects which they manifest under 
the categories of substance-and-attribute, cause-and-effect, and interaction. In 
this way, the sudden dilatation of the air that we hear as thunder and the 
violent electric discharge that we see as lightning are understood to be 
objectively simultaneous (or very nearly so), although the perception of the 
former lags behind the perception of the latter. (It is worth noting that, on this 
view, the understanding, by enmeshing appearances in its categorical frame- 
work, also bestows objectivity on their subjective time order. Thus, according 
to the established laws of nature some time must elapse between the action of 
distant lightning on my retina and the action of simultaneous and equally 
distant thunder on my eardrum. This explains the time lag between my 
perceptions of these phenomena; the same time lag would be registered by 
electronic sensors placed near me.) According to Kant, the construction of the 
objective time order of events chiefly depends on two principles. By the 
Principle of Temporal Succession in accordance with the Law of Causality an 
event P can be said to precede another event Q if and only if, pursuant to a 
natural law, P is a cause of Q and Q is not a cause of P. By the Principle of 
Coexistence in accordance with the Law of Interaction or Community two things 
Aand Bcan be said to exist simultaneously in space if and only if they stand in 
thoroughgoing mutual interaction. If we assume, as Kant certainly did, that 
the temporal succession of events linearly orders their simultaneity classes, it is 
clear that two events P and Q are simultaneous if and only if neither precedes 
the other. The first Principle allows two situations in which this can obtain, 
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namely, (1) when neither P is a cause of Q nor Q isa cause of P, and (ii) when P is 
a cause of Q and Q is cause of P. Let us call situations (i) and (ii) the negative 
and the positive causal criterion of simultaneity, respectively. Situation (ii) may 
sound strange or even impossible, yet its reality is clearly suggested by Kant’s 
Principle of Coexistence. The mutual interaction between things A and B 
deserves to be called thoroughgoing (durchgdngig) only if the action of each 
responds to the very reaction it presently provokes in the other. This mode of 
involvement is typical of the mutual attraction of distant masses in a 
Newtonian world, which Kant very probably had in mind when he devised his 
conception of a community of substances held together by their “thorough- 
going interaction”. If, as in Relativity, no instant action at a distance is allowed, 
the positive causal criterion of simultaneity is of course inapplicable, and one 
must either forgo the Kantian approach to temporal relations or make do with 
the negative causal criterion. The latter criterion, however, can only yield a 
partition of events into simultaneity classes if, regardless of the distance 
between cause and effect, there is no positive lower bound to the time lag 
between them.” Consequently, in a relativistic world, where all causal action 
propagates with a speed equal to or less than that of light, we face the following 
options. Either (a) we give up the notion that simultaneity is an equivalence, 
and with it the standard view of time as a continuum—the quotient of events 
by simultaneity—linearly ordered by temporal succession; or (b) we introduce 
a purely conventional criterion of simultaneity; or (c) we seek a substitute for 
the Kantian approach to this matter. Philosophers who for one reason or other 
have ignored or avoided the third alternative have usually promoted a 
combination of the first two. 

Thus, the well-known explication of the concept of simultaneity proposed 
by Hans Reichenbach in the 1920s conjoins a natural definition based on the 
negative causal criterion with a rule for stipulating conventional definitions 
compatible with the former. Reichenbach gives a purely causal criterion of 
temporal succession: “If [an event] E, is the effect of [an event] E,, then E, is 
said to be later than E,”° If E, is not later than E, and E, is not later than E,, 
E, and E, are “indeterminate as to time order”. Reichenbach defines: 


Any two arbitrarily chosen events which are indeterminate as to time order are said to be 
simultaneous.> 


According to Reichenbach, this so-called “topological definition”’ expresses 


the deeper sense of the feeling that we attach to the concept “simultaneous”: two 
simultaneous events are so situated that no causal chain can run from any of them to the 
other. The events that are taking place at this very moment in a distant country cannot be 
influenced by me, not even by telegram; nor can they in turn exert any influence on what 
happens here right now. Simultaneity means disconnection of the causal link.® 


Due to the existence of “a finite limiting velocity for causal propagation”’ the 
natural relation of simultaneity characterized by the “topological definition” 
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does not partition the universe into classes of simultaneous events. Such a 
partition is however indispensable for establishing a global time variable. The 
physicist must therefore supplement the “topological” definition of simul- 
taneity by the negative causal criterion, with a convention fixing what Adolf 
Griinbaum has called the concept of “metrical simultaneity”.? The convention, 
however, cannot be wholly arbitrary, for the extension of the latter concept 
must be included in that of the former. (ie., if P and Q are “metrically” 
simultaneous they must also be “topologically” simultaneous.) 

Reichenbach set up a rule to which every such convention must conform. 
Reichenbach’s rule is formally stated as part of his axiom system for Special 
Relativity.” Although its wording is not specifically linked to this theory, it 
appears unlikely that the rule can serve its purpose and ensure the definability 
of a global time in a substantially larger context—e.g. in an arbitrary 
relativistic spacetime. I shall therefore regard it (like everybody else, by the 
way) as a rule for the definition of simultaneity of distant events in a universe 
governed by Special Relativity. (The physical contents of this theory can, as we 
know, be given a coordinate-free formulation which does not involve the use of 
any particular definition of simultaneity.) Restriction of the rule to Special 
Relativity inevitably curtails the philosophical significance of Reichenbach’s 
doctrine; but without this restriction it is far too indefinite to be argued for or 
against. We assume that each material particle constitutes—or is equipped 
with—a natural clock. The arrival time of the first signal sent from particle A to 
particle B at time ¢ is the g.l-b. of the times of arrival at B (as registered in B’s 
clock) of all signals that can be sent from A to B at time t¢ (as registered in A’s 
clock). A bouncing signal from A to Bisa signal sent from A to Band reflected 
back to A as soon as it arrives in B. The arrival time of the first bouncing signal 
sent from A to Bat t is the g.l.b. of the times of arrival at A (according to A’s 
clock) of all bouncing signals that can be sent from A to Bat t (according to A’s 
clock).!° Reichenbach’s rule for the definition of simultaneity can now be 
stated thus: 


Two events are simultaneous if they occur at the same time according to 
synchronous natural clocks at their respective locations. The clock at a 
particle A is synchronous with the clock at a particle B if the arrival time tg 
of the first signal sent from A to Bat time ty and the arrival time t, of the 
first bouncing signal sent from A to B at t, are related by 


tgp—to = E(t4—to) (7.1.1) 
where ¢ is a real number such that 
O<e<l (7.1.2) 


Reichenbach’s own formal statement of the rule adds that the coefficient ¢ 
must remain constant for each fixed A,as B ranges over all particles. In order to 
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define simultaneity we therefore need only to choose a value of the 
“Reichenbach parameter” ¢ in the allowed interval (0, 1). Restriction to this 
interval is both necessary and sufficient to ensure that the new conventional 
definition will be compatible with the natural, “topological” definition based 
on the negative causal criterion.!! 

According to Reichenbach, the definition of simultaneity between distant 
events in an inertial frame set forth in Einstein (1905d), §1, is merely an 
instance of the above rule, with e = 1/2.!2 I do not think that this statement, 
which has been repeated ad nauseam in the literature, can be accepted without 
qualification. For Reichenbach’s rule says nothing about the state of motion of 
the clocks being synchronized, while Einstein’s definition presupposes that 
they are both at rest in the same inertial frame. In fact, unless this 
presupposition is made, Reichenbach’s rule may not be consistently applicable. 
(Suppose, for instance, that the clock from which the bouncing signal is sent is 
affixed to the rim of a rotating disk.'3) We shall therefore narrow down the 
scope of Reichenbach’s rule to the definition of simultaneity in inertial 
frames.'* 

Let F be an inertial frame in which simultaneity is to be defined in 
accordance with Reichenbach’s rule. We are free to choose a Lorentz chart x 
adapted to F. By the “Lorentz image” of a set of events I shall mean its image 
by x in R*. Since Lorentz charts are diffeomorphisms and linear isomorphisms, 
the Lorentz image of a set exhibits the differentiable and affine properties of 
that set. As usual, we choose our units so that the Lorentz images of null lines 
intersect the Lorentz image of the worldline of any particle at rest in F at an 
angle of 45°. Let (A, t) denote an event that occurs on particle A at time t (by 
A’s clock). We now choose an admissible value of the Reichenbach parameter ¢ 
and use it for synchronizing all clocks at rest in F by means of bouncing signals 
emitted from a given “base particle” A. Pick out any time t and consider the set 
S, of events simultaneous with (A, t) by virtue of our stipulation. The Lorentz 
image of S, is a one-sheet hypercone with its vertex at x(A, t), each of whose 
generators makes an angle equal to arc tan(|1 —2e|) with its perpendicular 
projection on the hyperplane 7m) = x°(A,t). (Recall that 7» is the first 
projection function of R* onto R, which sends each real number quadruple to 
its first element. The stated results can be read off Fig. 2, which illustrates an 
application of Reichenbach’s rule with ¢ = 1/4. The three vertical lines 
represent the Lorentz images of the worldlines of the base particle A and two 
other particles, B and B’, at rest in the inertial frame F. The dotted lines 
represent the Lorentz images of the worldlines of first bouncing signals sent 
from A to B at ty and from A to B’ at to. The locus of the Lorentz images of 
events simultaneous with (A,tg) is therefore the hypercone generated by 
rotation about the Lorentz image of A’s worldline of the ray from x(A, tg) 
through x(B, tg). This ray is represented in the diagram, where it makes with 
the horizontal the angle 6 = arc tan (1/2) = arc tan (|! —2e|).) If ¢ = 1/2, the 
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Figure 2. 


hypercone degenerates and coincides with the hyperplane zy = x°(A, 2). If ¢ is 
less (resp. greater) than 1/2, the hypercone lies between the said hyperplane and 
the Lorentz image of the future (resp. past) light cone at (A, t). Thus, unless 
é= 1/2, the simultaneity classes defined by our stipulation will not be 
hyperplanes in Minkowski spacetime and inertial motions referred to our 
global time will not be uniform or even smooth. We are therefore forced to 
draw one of the following two conclusions: Either Reichenbach’s doctrine 
concerning the conventionality of simultaneity does not apply to inertial time 
scales (in Lange’s sense, page 17) and thus misses the whole point of Einstein’s 
discussion in §1 of “The Electrodynamics of Moving Bodies” (Section 3.2); or 
Reichenbach’s remark that the arbitrary factor « must remain constant for 
each choice of a base particle is not to be taken seriously, but should be 
modified in accordance with his declared intent of explicating Einstein. I opt 
for the latter alternative, which I find more charitable. We shall therefore 
assume that the global time produced by any instance of Reichenbach’s rule 
must constitute an inertial time scale, and we shall examine the conditions that 
this prescription imposes on the Reichenbach parameter e. 

The simultaneity classes into which an inertial time scale partitions 
Minkowski spacetime constitute a family of parallel spacelike hyperplanes. 
Each particular choice of the Reichenbach “parameter” ¢ is therefore in fact a 
smooth real-valued function of direction (in the space of the relevant inertial 
frame), which we shall henceforth call a Reichenbach function. If ¢(r) is the 
agreed value of the function in the direction of the vector r, e( —r) must equal 
1 —«(r). Since the range of each Reichenbach function ¢ is contained in the 
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interval (0, 1), it hasa supremum ¢,,,,,, equal to its value in a particular direction 
we shall denote by r,,a,, and an infimum épin = 1 — Emax = €(—Fmax). Every 
Reichenbach function takes the value 1/2 at the directions perpendicular to 
I'max- (Unless it is constant, these are the only directions at which it takes that 
value.) Consequently, a Reichenbach function is fully specified by the choice of 
its supremum ¢,,,, and the direction r,,,, at which it occurs, and may be 
denoted by (Emax: Fmax): 

The Reichenbach function e = const. = 1/2 determines the so-called stan- 
dard synchronism of distant clocks at rest in an inertial frame. (Clocks 
synchronized in this way keep Einstein time.) Clock synchronism in accord- 
ance with any other Reichenbach function is usually described as non- 
standard. There is an interesting relation between standard and non-standard 
synchronism, as we shall now see. Let F and F’ be inertial frames. By the speed 
of F’ in F I shall mean, as usual, the distance travelled in a unit of time by a 
point of F’, as measured by non-stressed rods and clocks in standard 
synchronism at rest in F. The partition of events into simultaneity classes in 
accordance with the non-standard synchronism determined in F by the 
Reichenbach function (may, Fmax) agrees with their partition according to 
standard synchronism in F’, if F’ moves in F in the direction of r,,,, With speed 
v = tanh (arc tan|1 —2émax{). (The reader can satisfy himself of the truth of this 
statement by comparing Fig. 2 on page 225 with Fig. 1 on page 97.) If each 
frame is densely packed with clocks in standard synchronism, all that we need 
to do to put the clocks of F in the non-standard synchronism described above 
is to set each one of them by the nearest clock of F’ when the latter shows a 
given time. The decision to change standard synchronism in an inertial frame F 
into some agreed form of non-standard synchronism (compatible with the 
Law of Inertia) amounts thus in effect to a decision to regulate clock 
synchronism in F by the standard synchronism proper to some other inertial 
frame. 

Before the discovery of radio waves, distant clocks were often synchronized 
by carefully and slowly carrying about one good clock by which all others were 
set. In his renowned treatise on The Mathematical Theory of Relativity 
(1923), A. S. Eddington showed that this procedure is substantially sound and 
agrees with standard synchronism in the sense that I shall now explain.'> Let 
us say that a clock Q, affixed to an inertial frame F is synchronized by clock 
transport at speed v witha clock Q, at rest in the same frame if'a clock travelling 
in straight line from Q, to Q, with constant speed v agrees with each as it passes 
by it. “Speed” can here be understood in the sense indicated in the preceding 
paragraph, but if we desire to avoid the dependence on a previous definition of 
synchronism by bouncing first signals, we can also understand it as self- 
measured speed, i.e. as the distance measured by unstressed rods at rest in F 
travelled in a unit of time measured by the travelling clock itself. If the clock at 
a given base point A is synchronized by clock transport at a fixed finite speed v 
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with every clock at rest along each ray drawn from A in the relative space of F, 
we obtain a global time function 7,. 7, does not agree with any global time 
defined in accordance with a Reichenbach function and does not constitute an 
inertial time scale. But the limit 7, to which 7, converges as v tends to 0 is an 
Einstein time for F, i.e. a global time defined by standard synchronism in F. T, 
is of course practically the same as 7) if v is very small in comparison with the 
speed of light. Eddington’s result is often expressed by saying that standard 
synchronism is equivalent to clock synchronization by infinitely slow clock 
transport.!° 

We obtain a spacetime chart by combining, as on page 54, a time 
coordinate function defined by standard or non-standard Reichenbach 
synchronism in an inertial frame F with a Cartesian coordinate system adapted 
to F. I call such a chart a Winnie chart, after John L. Winnie (1970). The time 
axis of a Winnie chart is not Minkowski-orthogonal to the hyperplane 
spanned by its three mutually orthogonal space axes—unless its time 
coordinate function has been set up by standard synchronism, in which case 
the Winnie chart is of course a Lorentz chart. If its time coordinate function 
reflects a non-standard synchronism, the restriction of a Winnie chart to a 
hypersurface of simultaneity is not an isometry of this hypersurface onto the 
Euclidean space {const.} x R®. The coordinate transformation y-x~! be- 
tween two Winnie charts x and y is what Winnie calls an ¢-Lorentz 
transformation; but I think it will be fairer and less misleading to call it a 
Winnie coordinate transformation.!’? By Zeeman’s Theorem (p. 127), the 
Winnie point transformation y~'-x cannot be a causal automorphism of 
Minkowski spacetime unless the Winnie charts x and y are in fact Lorentz 
charts. If the latter condition does not obtain, two deterministic processes 
whose initial conditions, referred respectively to x and y, read the same, can run 
wildly different courses. Winnie transformations do not constitute a group,’® 
so that it is probably pointless to seek for Winnie-invariants or for a Winnie- 
covariant formulation of the laws of nature. 

While Winnie charts thus make up a rather undistinguished subset of the 
maximal atlas of Minkowski spacetime, Lorentz charts are, one may say, made 
to order to fit its structure. The discovery of Lorentz charts and Lorentz 
transformations was prompted by experimental findings in optics and 
electrodynamics and led in turn to the formulation of new laws of mechanics 
that have since been brilliantly confirmed. Our study of Special Relativity in 
Chapters 3 and 4 (especially Sections 3.7 and 4.6) showed that not only inertial 
frames but also Lorentz charts enjoy a preferred status in that theory. This is 
not due to some particular bias introduced into the usual formulation of the 
theory by a convention on synchronism,!® but to the fact that the Lorentz 
group is the theory’s symmetry group. This, in turn, implies that the standard 
synchronism used 1n the definition of Lorentz charts is singled out by the very 
nature of things in a universe ruled by Special Relativity.”° 
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The doctrine of the conventionality of distant simultaneity in Special 
Relativity owes its resilience in part to a persistent belief in the positive and 
negative causal criteria as the only grounds on which a natural relation of 
simultaneity can rest.?! That this belief is false is most clearly seen in the 
context of General Relativity. As we noted in Section 6.3, events in a 
Friedmann world are partitioned into simultaneity classes in a unique, natural 
way by the global time coordinate of a “comoving” chart (see equation (6.3.2) ), 
which of course has no relation at all to Kant’s causal criteria. Insofar as the 
universe we live in is roughly a Friedmann world, it sports—in the large—a 
natural cosmic time that bestows a definite meaning on “the age of the 
universe”, “the first three minutes” of its existence, and other such phrases now 
common in scientific literature. On the other hand, there are relativistic worlds 
which cannot be partitioned into simultaneity classes in any way, neither 
naturally nor artificially, though they can certainly contain pairs of events that 
satisfy the negative causal criterion. In this connection, there is a theorem 
proved by Hawking (1969): a relativistic spacetime admits a universal time 
(and hence a partition into simultaneity classes) if and only if it is stably causal, 
in the sense defined on page 337, note 23. (Roughly: ifand only if the spacetime 
does not contain any closed timelike curves and no such curves will be 
generated by slightly expanding the null cone at each point.)?? The theorem 
shows that simultaneity has after all an essential relationship to the causal 
structure of relativistic spacetime, although a more subtle one than 
Reichenbach surmised. This relationship is particularly intimate if the 
spacetime is flat—i.e. Minkowskian—for in that case, by Zeeman’s theorem, 
the causal structure determines the metric up to a scale factor. Since the 
Minkowski metric, as we pointed out above, bestows a preferential status on 
the simultaneity relation determined by standard synchronism in an inertial 
frame, it cannot surprise us to learn that this relation can be defined in the 
Minkowski geometry in purely causal terms. As I mentioned on page 123, 
A. A. Robb (1914) and H. Mehlberg (1937) have produced axiom systems for 
Minkowski geometry in which the only primitive predicate—apart from 
“element” and “e”—has roughly the meaning of “chronologically follows” (in 
Robb’s system) and “is causally connectible with” (in Mehlberg’s). 
Simultaneity by standard synchronism is of course definable in both 
systems.?? In an updated English version of his system of 1937, Mehlberg 
(1966) used the word “connected” as his sole undefined term. This is intended 
to stand for the relation between two events E, and E,, on particles a, and a), 
when each event concurs with the collision, at different times, of a third particle 
a3 with its respective particle. In practice, one may substitute for a3 a series of 
successively colliding particles. Since the particles considered include photons, 
E, is connected in this sense with E, ifand only if E, # E, and E, lies either in 
the causal past or in the causal future of E,. Mehlberg uses the word “event” to 
designate the elements of the theory. The following definitions are transcribed 
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almost literally from Mehlberg (1966), p. 475. (I have only suppressed a few 
words I consider superfluous.) 
1. The event E occurs spatially between the unconnected events E’ and E” if every event 
connected with both E’ and E” is also connected with E. 
2. The events E, E’, and E” are collinear if one of them occurs spatially between the other 
two. 
3. An inertial frame of reference F is a partition of all events which satisfies the following 
conditions: 
a. If the events E and E’ belong to the same class of the frame F then E and E’ are 
unconnected. 
b. If the event E is not a member of a class t of F then E is connected with an event 
belonging to ¢. 
c. If the events E and E’ both belong to the class ¢ of F then event E” collinear with 
the aforementioned two events also belongs to t. 


The definition of simultaneity can be made extremely simple: Two events are 
simultaneous relative to an inertial frame F if and only if they belong to the same 
class of F. That this definition yields in fact the notion of Einstein simultaneity 
by standard synchronism can only be seen in the light of Mehlberg’s seven 
axioms. It is connectedness as characterized by these axioms that occurs in the 
definiens as a primitive term. The axioms can be briefly stated only with the aid 
of several additional definitions and cannot be properly understood without 
further comment. I must therefore refer the reader to Mehlberg’s writings.”* 

Although Mehlberg’s “Essai” of 1935/37 is duly mentioned in the well- 
known books by Adolf Griinbaum (1963) and Bas van Fraassen (1970), most 
philosophers of science did not realize that it was possible to define Einstein 
simultaneity in terms of causal connectibility until David Malament (1977) 
drove this point home in a short and trenchant paper. More importantly 
perhaps, Malament proved that simultaneity by standard synchronism in an 
inertial frame F is the only non-universal equivalence between events at 
different point of F that is definable (“in any sense of ‘definable’ no matter how 
weak”) in terms of causal connectibility alone, for a given F. (By a universal 
equivalence between events I mean a relation that holds between any two 
events in the universe.) Malament also showed that any equivalence between 
events that is definable in terms of causal connectibility alone, without 
specifying an inertial frame—in other words, any candidate to the status of a 
frame-independent or absolute simultaneity—is either universal or trivial. (By 
a trivial equivalence | mean a relation which everything has with itself and with 
nothing else.) Malament’s proofs apply of course only to a universe governed 
by Special Relativity. Indeed, their ease and brilliance results from what we 
may call, with mild abuse of language, the “arithmetization” of spacetime 
geometry by means ofa Lorentz chart. Let <p, g > denote the Minkowski inner 
product of two real number quadruples p and q (see equation (4.2.1) ). We write 
|p| for <p, p >. We say that two real number quadruples p and q are connectible 
if and only if |q—p| 20. Let .@ denote a universe governed by Special 
Relativity. Obviously, two events P and Q in .@ are causally connectible (in 
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the sense defined on page 121) if and only if, for any given Lorentz chart 
x: M—>R*, x(P) is connectible with x(Q). A Lorentz chart is thus an 
isomorphism between .@ as structured by causal connectibility, and R* as 
structured by connectibility. Consequently, a relation between events can be 
defined in terms of their causal connectibility in 4, if and only if its Lorentz 
image by a given Lorentz chart is definable in terms of connectibility in R*. 
Malament’s work turns on this simple truth. Since the isomorphism between 
M and R* defined by a Lorentz chart is not canonic it may seem that his 
argument is tainted by a surreptitious convention. But Malament’s argument 
does not depend on the particular choice of a Lorentz chart but only on the fact 
that such a choice is possible. This possibility lies at the heart of Special 
Relativity and has never been disputed by conventionalists. 

The upshot of our discussion of simultaneity may be summed up as follows. 
In the natural philosophy of Relativity, time relations between events are 
subordinate to and must be abstracted from their spacetime relations. A 
partition of the universe into simultaneity classes amounts to a decomposition 
of spacetime into disjoint spacelike hypersurfaces each one of which separates 
its complement into two disconnected components. (In this respect, each class 
of simultaneous events behaves in Relativity like the Aristotelian “now”.) Such 
a decomposition is impossible unless the universe is stably causal; but where 
one is possible, many more are available as well. Not all of them, however, will 
be equally significant from a physical point of view. Thus, in a universe that 
admits a Robertson-Walker line element, the hypersurfaces orthogonal to the 
worldflow of matter provide an unrivalled partition of events into natural 
simultaneity classes. In the Minkowski spacetime of Special Relativity, the 
hypersurfaces orthogonal to the congruence of timelike geodesics which 
makes up any given inertial frame define a partition of events that is naturally 
adapted to that frame. Simultaneity may thus be regarded as a unique physical 
relation in a Robertson—Walker world, while in Special Relativity one must 
put up with a three-parameter family of such physical relations. 
(Reichenbach’s rule, as normally understood, does no more than expand it toa 
six-parameter family by the cheap expedient of associating every inertial frame 
with the full three-parameter family of simultaneity relations adapted to each.) 
Ina more realistic world model things must be much less simple. It seems likely 
therefore that the concept of simultaneity can only claim physical significance 
locally, in the small neighbourhoods in which Special Relativity is valid, or very 
roughly, in the large scale in which the universe agrees with a 
Robertson—Walker model.?5 


7.2 Geometric Conventionalism. 


We have fought the alleged conventionality of simultaneity in Special 
Relativity invoking the structure of Minkowski spacetime. Our argument is 
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therefore seemingly powerless against those who believe that physical 
geometry itself is fixed by convention. This, however, as I said above, is a 
completely different question, that must be resolved on_ properly 
philosophical— e. universal, not theory-dependent—grounds. On this issue I 
contend that the conventionalist philosophy that puts the free institution of a 
system of geometry on a par with the free choice of a unit of measurement or a 
coordinate system which obtain their meaning from that geometry, confuses 
two very different senses in which freedom is at play in the enterprise of science. 
Take, for instance, the basic relativistic conception of the universe as a four- 
dimensional differentiable manifold with a Lorentz metric linked to the 
distribution of matter by the Einstein field equations. No one can doubt any 
longer that this idea was not imposed or even proposed by the observation 
records available circa 1912, but, like other such grand schemes for the 
investigation and understanding of physical phenomena, had to be freely 
produced. Einstein’s own awareness of this fact was expressed in his statement 
that the basic concepts and the fundamental laws of physics are not inductively 
gleaned from sense data but are “free inventions of the human spirit”.' The act 
of freedom from which such inventions arise is not, however, the arbitrary 
decision of a free will opting at no risk between several equally licit—though 
perhaps unequally convenient—alternatives set forth by a previously accepted 
conceptual framework. It is the venturesome commitment to a “way of 
worldmaking”, a plan for the intellection of nature, that will guide the scientist 
in his efforts, outline his routine tasks and options, and ultimately measure out 
his chances of success and failure. To say that such a plan is “adopted by 
convention”, as one might agree to use the Systeme International, merely blurs 
the contrast between two distinct levels of decision and hinders our 
understanding of the nature and scope of intellectual endeavour. 

That scientific experience is not just heaped up as it comes but is built by 
combining the data of sense according to the plan of reason, was first clearly 
stated by Kant. Although Kant knew full well that science has a history and 
even gave the exact dates at which several of its main branches came of age,” he 
apparently never thought that reason’s plan may undergo mutations, that the 
building of experience may have its very foundations redesigned while it is still 
being worked on. Indeed, he went as far as to say that a different plan, another 
system of forms and categories and goals, is not only utterly unthinkable, but 
would, in case we could grasp it, have nothing to do with experience, which is 
the only kind of knowledge through which objects are given us.*> Kant’s “pure 
theoretical reason” is therefore rightly viewed as an impassive, changeless, 
unhistorical power, that came to life fully armed like Athene out of her father’s 
head; and although Kant was wont to describe it as a bottomless abyss, whose 
elements and structure can in no way be accounted for,* few have ventured to 
think of it as the outcome or the exercise of inscrutable freedom. Kant 
wondered “why we have precisely this kind of sensibility and an understanding 
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of such nature, that through their combination experience becomes possible”; 
why moreover these two “otherwise totally heterogeneous sources of knowl- 
edge always concur in such good harmony to bring about [... ] the 
possibility of an experience of nature [ .. . ], as if nature had been expressly 
designed for us to grasp”.> But he did not ask himself what would happen if in 
some field of knowledge the incoming data become less and less pliable to the 
plan of reason, if our efforts to weave them, following the established pattern, 
into the unfinished tapestry of experience begin to yield diminishing returns. 
That was, however, the state of affairs in theoretical physics a century after 
Kant’s death, and the question that he forgot to ask was answered de facto 
within the next decades. All the cherished principles that Kant had judged 
indispensable for the construction of experience—mass conservation, causal 
determinism, instantaneous distant interaction, the continuity of intensive 
quantities, the axioms of Euclidean geometry—were discarded and replaced 
by a tentative, imperfectly clear, not altogether consistent, but dramatically 
fruitful new set of basic concepts and laws.°® 

Philosophers offered diverse accounts of the breakdown of Kantian reason. 
The shallowest readily believed that by means of cleverly designed experiments 
and observations-—e.g. the Michelson~Morley interferometer experiment of 
1887, the observations of the 1919 solar eclipse by the astronomers of the 
British expedition—scientists had proved the old principles wrong and their 
substitutes right. The conventionalist account was subtler. It granted that the 
mute occurrences around us can only become empirical facts through being 
conceived in terms of an intellectual system; that sense appearances have to be 
spelled out by the understanding in order to be read as scientific experience.’ 
But reason’s “spelling” is no less subject to revision than ordinary ortho- 
graphy. Since, like the latter, it has to be accorded freely, the suggestion is 
inevitable that it is also a matter of convenience only. 

For a less metaphorical statement of these views we turn to Reichenbach’s 
book on The Theory of Relativity and A Priori Knowledge (1920).® Taking his 
cue from Hilbert’s Foundations of Geometry, Reichenbach favours the 
axiomatic formulation of physical theories. He notes, however, that there is an 
important difference between axiomatized mathematical theories such as the 
one put forward in Hilbert’s book, and axiomatized physical theories. While 
the former characterize their undefined terms—the primitives—in a series of 
unproved propositions—the axioms—the latter must go one step further in 
order to put the conceptual structure determined by the axioms in connection 
with reality. According to Reichenbach this is achieved by means of 
stipulations that coordinate primitive or defined terms of the theory with this 
or the other aspect of reality. Thus a two-place predicate of the theory might be 
coordinated with a binary relation between physical events, etc. Moritz Schlick 
(1918) had asserted that coordination (Zuordnung) is the only kind of relation 
that can be established by the human mind.’ Reichenbach remarks that there is 
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a great difference between the coordination of two concepts or conceptual 
objects, as practiced in mathematics, and the coordination of a concept with 
reality, as required by physics. In the former case—e.g. when defining a 
function from real numbers to real number triples—the coordinated objects 
are defined and known independently of the coordination. It may happen that 
by coordinating elements of one set with those of another the latter are 
endowed with new structural properties and relations (which are then said to 
be “induced” by the former). But according to Reichenbach even in such cases 
both sets must be given as collections of distinct elements prior to the 
coordination. The coordination of concepts with reality in empirical science 
must proceed very differently. For “only the coordination of the concept 
defines what is an individual thing in the ‘continuum’ of reality; and only the 
conceptual network decides, on the basis of perceptions, whether an individual 
thing that is envisaged in thought ‘exists out there in reality’.”!° Empirical 
knowledge rests therefore on a coordination between two sets, “one of which is 
not merely ordered through the coordination but is actually defined—has its 
elements determined—by the coordination”.'! “The coordination must procure 
itself one of the sequences of elements to be coordinated.”!? As a consequence 
of this some of the axioms of a physical theory must play a special role. They 
are the axioms that characterize the primitive terms coordinated with 
otherwise undefined elements of reality. Such “principles of coordination 
[ . . . ] define the individual elements of reality and are in this sense constitutive 
of the real object. In Kant’s words: “Because only by means of them can any 
object of experience be conceived’.”! Not every axiomatic system, however, 
provides admissible rules of coordination. 


Although the coordination maps to undefined elements, it can take place only in a perfectly 
definite way and not arbitrarily. We call this, the determination of knowledge by experience. 
We notice the curious fact that it is the defined side that determines the individual things of 
the undefined side, and that vice versa it is the undefined side that prescribes the order of the 
defined side. This reciprocity of coordination expresses the existence of reality.'* 


” 


Admissible or “correct” coordinations can be distinguished from “incorrect 
ones because only the former are “consistent”. Inconsistencies are discovered 
by observation. “The theory that continually leads to consistent coordinations 
is called true.”'> 

According to Reichenbach the gist of Kant’s rationalism lies in the claim 
that a fixed set of constitutive principles, allegedly prescribed by human 
reason, are guaranteed to yield consistent coordinations forever. If Kant’s 
claim were tenable all empirical knowledge of actual physical objects could be 
expected to conform with the said set of principles and these would be a source 
of a priori knowledge of the physical world. But, says Reichenbach, the Theory 
of Relativity has proved that Kant’s claim is false. Einstein has shown that the 
interpretation of certain experimental results—such as Michelson and 
Morley’s—cannot be consistently carried out in the light of the standard 
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principles of classical physics, which include some that Kant regarded as a 
priori true. Kant was right in stressing the need for constitutive principles, 
which are reason’s contribution to knowledge. But he was wrong in 
maintaining that such principles are immutable and cannot be affected by the 
growth of experience. “The contribution of reason is not expressed by the fact 
that the system of coordination contains unchanging elements, but by the fact that 
arbitrary elements occur in it.”!® However, such arbitrary elements cannot be 
chosen at random but must be adjusted to yield consistent descriptions of 
observed facts. Since inconsistencies turn up as knowledge increases, the 
constitutive principles of empirical knowledge are subject to historical change. 
Every change in them “brings about a change in the concepts of thing and 
event”. 


And herein lies the difference between Kant’s view and ours: whereas for him only the 
determination of each particular concept is an infinite task, we contend that even our 
concepts of the object of science as such, of the real and of how it can be determined, can only 
gradually become more precise.!” 


So far, so good. Some of Reichenbach’s notions, like that of “consistent 
coordination”, and indeed the entire process by which the concept singles out 
its empirical correlate from the vast mass of what Reichenbach blandly calls 
“reality”, would certainly have deserved some further elucidation. But we 
cannot accuse Reichenbach (1920) of mixing up the free commitment to a 
constitutive principle with the arbitrary choice of one convention or other. 
Since the former involves the risk of failure through “inconsistency”, it is not, 
like the latter, indifferent to truth. Though not true in itself in any ordinary 
sense, a successful system of coordinating principles is the critical factor in 
transmuting the stubborn, silent presence of “reality” into a manifestation of 
empirical truth. Nevertheless, when Reichenbach (1924) went on to show in 
detail how the Theory of Relativity ought to be construed in accordance with 
his epistemology, he promoted from the outset the very confusion that we here 
regard as typical of conventionalism. Coordination is now to be achieved by 
means of “definitions”. The difference between mathematical and physical 
definitions is described as follows: 


Mathematical definition is conceptual definition [ Begriffsdefinition], i.e. it defines concepts by 
concepts. Physical definition assumes the relevant concept to be already defined and 
coordinates with it an actual thing [ein Ding der Wirklichkeit]; it is a coordinative definition 
[Zuordnungsdefinition]. Physical definition consists therefore in the coordination of a 
mathematical definition with a “piece of reality” [zu einem “Stick Realitdt'’].'® 


Reichenbach’s own example of a coordinative definition is the 1889 conven- 
tion by which the unit of length was identified with the Paris metre rod. The 
coordinative definitions proposed in his book do not have quite the same 
character as this example, for they do not coordinate a concept of the theory 
with an actual thing or event, but rather with a class of physical phenomena. In 
order to be recognized as such the latter has of course to be subsumed under an 
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earlier concept, pertaining to another theory, or even perhaps to what 
Reichenbach calls the “theory of everyday life”.'? Thus, for instance, 
Definition 9, which says that “light rays are straight lines”, coordinates the 
mathematical concept of a straight line not with one or more individual things 
or processes, but with the entire class of light rays, and must therefore rely on 
geometrical optics or on sheer common sense for the criteria of appurtenance 
to this class.2° Coordination as actually practised by Reichenbach can thus 
hardly be said to “define the individual elements of reality”. The coordinative 
definitions of his Axiomatics (1924) are more in the nature of dictionary entries 
that translate some terms of the theory into plain German. This reinforces the 
view of them as conventions, arbitrarily chosen from a stock of preconceived 
alternatives. It also suggests. erroneously that the conceptual framework of a 
scientific theory can be readily decomposed into elementary ingredients liable 
to be revised and replaced one by one. 

In the Philosophy of Space and Time (1928) Reichenbach further emphasizes 
the view that physics grows in piecemeal fashion, by successive acts of 
coordination. Such acts, by which one merely states that “such and such a 
concept is coordinated to this thing here”, cannot be replaced by “explanations 
of meaning” (inhaltliche Bestimmungen).?! Generally, a coordination is not 
arbitrary, but rather, due to the fact that the concepts involved have 
interrelated meanings (da die Begriffe untereinander inhaltlich verflochten sind), 
it can turn out to be true or false, if one requires that “the same concept shall 
always denote the same thing”. 


But certain preliminary coordinations must be established before the coordination 
procedure can be carried through any further. These first coordinations are therefore 
definitions, which are appositely called coordinative definitions. They are arbitrary, like all 
definitions. On their choice depends the conceptual system that develops with the progress 
of knowledge.?? 


Arbitrary coordinations constitute therefore the very first steps of a science. 
Thereafter, admissible coordinations are restricted by those already agreed on, 
or rather by the articulation the latter bestow on the actual course of 
phenomena. 

Reichenbach’s first and best known example of the role of arbitrary 
coordinative definitions in science concerns the foundations of physical 
geometry. According to him the distance function of physical space must be 
empirically ascertained by measurements performed with rigid rods, i.e. with 
solid bodies of unchanging size and shape. If the span between two places A 
and Bcan be covered precisely by n rigid rods of length k, lying end to end, and 
not by any number less than n, the distance from A to Bis nk. The proposition 
clearly depends on what we mean by “a rigid rod of length k”. This, in turn, is 
the outcome of two arbitrary conventions. Obviously, we must fix the unit of 
length. In Reichenbach’s time it was defined as the distance, at 0°C, between 
two marks on a particular metal rod kept in Paris. As I mentioned earlier, in 
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1924 Reichenbach had used precisely this example to illustrate his idea of a 
coordinative definition.?? But the truly decisive convention that lies at the 
foundation of physical geometry according to Reichenbach is the definition of 
the rigid rod itself. What bodies shall we repute invariable as to shape and size? 
One is tempted to answer: none at all. For we know that every sizable piece of 
matter will change its shape under our very eyes if subjected to sufficiently 
powerful forces. This common sense remark gives the cue to Reichenbach’s 
treatment of the question. By “force” he understands anything that is held 
responsible for a “geometrical change”, i.e., I presume, a change of size or 
shape.?* It seems therefore that we might define a rigid rod as a body isolated 
from the deforming influence of forces. However, according to Reichenbach, 
there are two kinds of forces. Differential forces act differently on different 
materials and their influence can be warded off by shielding. By comparing 
their effect on different bodies or under diverse degrees of insulation we can 
construct with increasing exactitude the ideal norm of a body totally 
impervious to such forces and correct by this standard the deformations which 
they inflict on real bodies. On the other hand, universal forces “act equally on 
all materials” and “there are no insulating walls against them”.?° 
Consequently, the deformations caused by universal forces are undetectable in 
principle, for they produce identical changes on all bodies, so that their effects 
cannot be quantified by comparison and corrected by calculation. In the light 
of his classification of forces, Reichenbach opted for the following definition: 

Rigid bodies are solid bodies which are not subjected to any differential forces, or from 


which the influence of differential forces has been corrected away by calculation. Universal 
forces are disregarded.”° 


Measurements performed with solid bodies conforming to this criterion will 
yield, according to Reichenbach, a distance function for physical space. The 
results of such measurements obviously depend on a convention, namely, our 
agreement to “set universal forces equal to zero” and fully ignore their effects. 
However, once this convention is adopted, “the geometry obtained is 
prescribed by objective reality alone”.”’ But it is always possible to obtain a 
different geometry by postulating a suitable field of universal forces. 
Reichenbach refers to this possibility as “the relativity of geometry”. 
Later in the book, when he comes to deal with “Gravitation and Geometry” 
(§ 40), Reichenbach declares that gravity is “the prototype of a universal 
force”.?® For, he says, it acts on all bodies in the same way. And, of course, there 
is no shielding against it. Hence, if we follow the convention proposed above, 
we ought to put the gravitational force everywhere equal to zero. Reichenbach 
implies that this was what Einstein did in his theory of gravity. I quote: 
Thus, gravity exerts on measuring instruments [ Messkérper] such a uniform [einheitliche] 
action, that the latter jointly define a unique geometry. We can abide by it. In this sense, 
gravity is geometrized. We do not speak of an alteration of the measuring instruments by the 


gravitational field, but regard the measuring instruments as “free from deforming forces” in 
spite of the influence of gravity.?° 
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The obvious suggestion is that one could still avoid Einstein’s unorthodox 
geometry by assigning a suitable non-constant value to the universal force of 
gravity. 

I cannot agree with this reading of General Relativity. It is of course very 
unsatisfactory to speak, as we have been doing, of the force of gravity as if it were 
a definite entity, existing out there in rerum natura, to which different theories 
refer invariably, while assigning to it different values. In fact, the truly 
remarkable relation between Newton’s theory of gravity and Einstein’s is that 
both assign practically the same quantitative outcomes to nearly all feasible 
experiments, though they postulate very different structures to account for the 
phenomena in their common purview. But no matter what we take to stand for 
“the force of gravity” in either theory, it is not a “universal force” in 
Reichenbach’s sense. 

Thus, we may not conclude that Newtonian attraction acts in the same way 
on all bodies just because they all fall with the same acceleration at a given 
potential. For the force exerted per unit volume on different bodies is not the 
same; as we see at once when we try to balance a gallon of mercury with a 
gallon of oil. If equal gravitational forces act on the mercury and on the oil—as 
would happen, for instance, if they lie at suitable different potentials—the 
effect on each will obviously be quite different. And a given gravitational force 
certainly does not cause the same deformations on all bodies, irrespective of 
the material they are made of. Otherwise, scale manufacturers could achieve 
substantial savings by using springs made of crepe paper instead of steel. 

Einstein’s theory of gravity is not easily expounded in terms of force—which 
is not the same as to say that it sets the force of gravity equal to zero. The 
nearest substitute we can find here for the classical gravitational field strength 
(equal, at a point, to classical force per unit mass) are the connection 
components I’j,, which Einstein once called “the components of the gravi- 
tational field”.°° As we know, these quantities are chart dependent, and hence 
ultimately of small physical significance. If referred to a Fermi chart, they all 
vanish along the chart’s time axis. Since the latter can be the worldline of any 
massive particle, we may, by the choice of a suitable chart, set the “components 
of the gravitational field” on such a particle equal to zero. But this has precious 
little to do with Reichenbach’s “disregard of universal forces” in his definition 
of the rigid rod. For the Ii, do not generally vanish on all the domain of the 
chosen chart, but show up as tidal forces on any extended body that includes 
the particle in question. And of course such tidal forces do not deform all 
materials alike, but cause different “geometrical changes” on rubber, say, and 
on glass. 

Another candidate for the title of “gravitational force” in General Relativity 
is the worldforce on a freely falling particle. (By worldforce 1 mean the 
generalization to a vector tangent to an arbitrary spacetime of the 4-force 
defined on Minkowski spacetime in (4.5.2).) Worldforces on particles can be 
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held responsible for the “geometrical change” from a geodesic to a non- 
geodesic worldline, and one should in principle be able to refer to them the 
deformations of extended bodies that Reichenbach had in mind when he spoke 
of universal and differential forces. But it would be wrong to assume that 
General Relativity puts all gravitational worldforces, all worldforces on free- 
falling particles, equal to zero. Such world-forces do indeed vanish invariantly 
in the case of a non-spinning monopole of constant rest mass, for a freely 
falling particle of this kind has a geodesic worldline along which its 
worldvelocity is self-parallel. But they do not generally vanish in the case of a 
spinning particle or, generally, a multipole.*' 

If, in spite of Reichenbach’s allegations, gravity cannot be properly 
described as a “universal force”, one is hard put to find an example within the 
extension of this term. The truth is that a “universal force”, a force which 
cannot be detected by any means because it modifies the shape of the 
instruments of detection in the exact way required to conceal its presence, 
belongs to the realm of science fiction and cannot be seriously countenanced in 
real science. At least in physics, a difference to be a difference must make a 
difference. Even the notorious Fitzgerald—Lorentz contraction was sup- 
posedly the effect of changes in molecular forces that a clever experiment might 
test (see p. 292, note 32). While Christian theology has long known of changes 
of substance which no outward feature will disclose, it was not until the Rise of 
Scientific Philosophy that we got to hear of a change of shape which nothing 
can ever make apparent.>? 

Reichenbach’s misconstrual of gravity as a universal force is not the only 
reason why his philosophy of geometry is inadequate for understanding 
General Relativity. Another shortcoming is that it was originally designed for 
the study of space without time and is ill-suited to cope with spacetime. 
Reichenbach is obeisant to Hermann von Helmholtz’s conception of geometry 
as the theory of congruence of rigid bodies, and it is not easy to see how this 
notoriously narrow conception can be extended to encompass spacetime 
geometry with a Lorentz metric. In fact it will not even fit the needs of proper 
Riemannian space geometry. Helmholtz’s fixation on measurement by rigidly 
transported rods prevented him from understanding Riemann’s idea of metric 
relations ina manifold in which “every line has a length independent of the way 
it lies, and therefore any line can measure any other line”. But without this idea, 
and the concomitant shift of attention from bodies and the volumes they fill, to 
points and the paths that join them, the development of spacetime geometry 
would not have been possible. It follows from a remark of Riemann’s that free 
mobility of rigid bodies can only exist in a proper Riemannian manifold of 
constant curvature. Because of this mathematical fact, it was almost unani- 
mously believed before Einstein that physical geometry had to be chosen 
among the four Riemannian geometries of constant curvature, namely, 
Euclidean, Bolyai-Lobachevsky, spherical and elliptic geometry. (Note 8 


DISPUTED QUESTIONS 239 


explains why this belief lent strong support to geometric conventionalism.) As 
Lord Russell once observed, the advent of General Relativity made this 
opinion look “somewhat foolish”.2? Reichenbach knew of course that there 
can be no free transport of rigid rods in a typical relativistic world. He 
therefore interpreted the fundamental coordinative definition of geometry as 
applying in practice not to rigid rods, but to infinitesimal rigid rods. Although 
the concept of an infinitesimal measuring rod seems at first sight paradoxical— 
for, indeed, if the length of a rod is negligible, of what use can it be for 
measuring?—there is a sense in which practically infinitesimal measuring rods 
have an important role to play in General Relativity. But, as we shall see, 
infinitesimal rods in this sense are not coordinatively defined standards by 
which to judge the geometry of physical space. On the contrary, they 
themselves must be constructed and controlled in accordance with the 
infinitesimal geometry to which General Relativity is committed. Roughly 
speaking, an infinitesimal measuring rod is a freely falling solid body of 
negligible thickness that is short enough not to be significantly stressed and 
strained by the inhomogeneities of the gravitational field. What is “short 
enough” will naturally depend on the allowable error and on the local 
spacetime curvature. The beams of a spacecraft that were infinitesimal as it left 
the Solar System may cease to be so if it approaches a Black Hole. Though 
inherently inexact, the concept of an infinitesimal rod can be given an exact 
foundation in differential geometry. Let P be any point of the arbitrary 
relativistic spacetime (W, g). The exponential mapping Exp p, defined on page 
278, maps a neighbourhood of the zero vector in the tangent space W> 
diffeomorphically onto a neighbourhood of P. On a usually smaller neigh- 
bourhood of zero the said diffeomorphism differs from an isometry by less 
than an assigned error. We shall denote the latter neighbourhood by Np. 
Choose a timelike vector v € Np, and let Dpdenote the set of vectors in Np which 
are orthogonal to v. Expp(Dp) is then a smooth spacelike hypersurface 
through P, which satisfies, within the assigned error, the relevant propositions 
of Euclidean geometry. Figures in Exp p(Dp) are indeed sets of instantaneous 
events, but we can protract them to yield bodies by means of a simple device. 
Let ) be the one and only geodesic in W such that y(0) = Pand jp = v (p. 278). 
Define No and Dg as above, for each point Q in the range of y. As on page 278 
we let TPQ denote parallel transport along y from P to Q. We shall consider only 
a short segment ’ of y, containing P, such that for any point Q in y’, Ny differs 
but little from tpg(Np). Let N be the intersection of the family { tp(No)|Qisa 
point in y'}. (N is a subset of Np.) Let D be the intersection of N with 
Dp. Take any 3-dimensional figure F c Expp(D). Put F’ = Expp!'(F), 
Fo= tpg (F’) and Fo= Expo(F 9). As Q ranges over )’, Fo describes a 
worldtube in ¥. Such a worldtube is the history of a small body in the 
relativistic world (W, g). The body obeys, in an obvious sense, the laws of 
Euclidean geometry. If F is a thin parallellepiped, a solid body derived from it 
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in the stated manner would be what I described as an infinitesimal rigid rod. 
Three such rods meeting orthogonally at a common endpoint whose worldline 
is y’ provide the frame for the Cartesian coordinates of a local Lorentz chart. 
The possibility of erecting such a frame along any timelike geodesic bears 
witness to the local (approximate) validity of Euclidean space geometry near 
each freely falling particle. It is the local Euclidean geometry that decides 
which objects do and which do not qualify as infinitesimal measuring rods at 
each place.3* Its validity evidently is not determined by measurements 
performed with rods, but flows as a mathematical theorem from the 
assumption that spacetime is a Riemannian manifold. 

Thus, the foundations of physical geometry do not rest for Relativity on the 
fact that rigid bodies are freely movable—as Helmholtz argued; nor—as 
Reichenbach claimed—on a convention to ignore universal forces; but-—as 
Riemann anticipated—on hypotheses that lay at the heart of the scheme of 
things by which we seek to judge and understand the course of events. Two are 
the basic hypotheses: 


I. Spacetime—i.e. the “arena” or stage of physical events—is a 4- 
dimensional differentiable manifold. 

II. The phenomena of gravitation and inertia—.e. the effects attributed by 
Newton to the “impressed force” of gravity and the “innate force” of 
inertia—are to be accounted for by a symmetric (0, 2) tensor field g on 
the spacetime manifold, which is linked to the distribution of matter by 
the Einstein field equations and defines a Minkowski inner product on 
each tangent space. (The condition last mentioned says that g is a 
Lorentz metric, and therefore determines a unique connection on the 
manifold, with its characteristic curvature and set of geodesics. The 
field equations entail that, in the absence of non-gravitational forces, 
the worldlines of the simplest massive particles are timelike geodesics 
of that set. This accounts for the classical gravitational phenomena of 
free fall, tidal strain, and planetary motion, and predicts new 
phenomena, such as the recession of the galaxies and gravitational 
collapse. The Minkowski inner product on each tangent space 
induces-—through the exponential mapping—a local approximate 
Minkowski geometry on a small neighbourhood of each worldpoint. 
This accounts for the Lorentz invariance of the laws of nature referred 
to local Lorentz charts. By thus deriving the local Minkowski 
geometries that explain the successes of Special Relativity from the 
global metric tensor g, the hypothesis accounts for the equivalence of 
inertial motion and free fall in a homogeneous gravitational field— 
implicit in Newton’s handling of the Earth-Moon system—and 
predicts the optical effects of gravity.) 


Hypothesis I is common to all extant theories of mathematical physics. It is 
inconceivable that a computer programmed for inductive inference could ever 
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derive it from crude observational data. Nevertheless, it is not an arbitrary, 
harmless convention, but a truly cosmopoietic principle which has hitherto 
guided the interpretation of all experimental results and shaped the factual 
system that physics has built from them. 

Hypothesis II is characteristic of General Relativity. We studied its 
development and motivation in Chapter 5. There are two ways of describing its 
geometrical import. Insofar as the tensor field g is in effect a Riemannian 
metric on the spacetime manifold, we may say that Hypothesis II explains 
gravity by geometry. We should bear in mind, however, that “gravity” stands 
here for gravitational effects, and “geometry” is not being used in its ordinary, 
pre-Minkowskian meaning. In a still wider sense, every concept of physics is 
“geometrical”—as Einstein once pointed out to Lincoln Barnett—and the 
gravitational field g is neither more nor less a “geometric object” than, say, the 
electromagnetic field tensor.*> On the other hand, insofar as the tensor field g 
assigns to each worldpoint an inner product on the respective tangent space, 
which in turn induces in a neighbouring region a (roughly) Minkowskian 
geometry from which a Euclidean geometry on parallel Einstein-simultaneous 
spacelike sections can be readily derived, we may say that Hypothesis II 
explains geometry by gravity. Here “geometry” may be understood in its old 
familiar sense, while “gravity” denotes the cause of the so-called gravitational 
phenomena. The behaviour of unstressed rods in a local inertial frame is now 
to be counted among the latter, insofar as it is also accounted for by the tensor 
field g. 

By grounding geometry (and chronometry)—as practised in scientific 
laboratories and factories of precision instruments—on a physical field 
governed by testable natural laws, General Relativity escaped the trilemma of 
apriorism, empiricism and conventionalism in which the epistemology of 
geometry was caught at the turn of the century.°° It seems to me, therefore, 
that Reichenbach puts the cart before the horse when he says that, as “it turns 
out that the non-Euclidean geometry obtained thanks to our coordinative 
definition of the rigid rod deviates quantitatively very little from Euclidean 
geometry when small regions are concerned, |. . .] we may therefore employ 
the following method of inference: Under the assumption that Euclidean 
geometry is valid locally, we can prove that a non-Euclidean geometry holds 
globally, which everywhere converges to the infinitesimal Euclidean 
geometry”.*’ In General Relativity the inference goes the other way around: 
Under the assumption that spacetime is Riemannian (specifically: Lorentzian) 
we can prove that a flat (specifically: Minkowskian) geometry must hold 
locally—to a first approximation. This flat geometry is the standard by which 
our measuring instruments are judged. The latter can indeed be used to test 
Hypothesis II, e.g. by measuring the components of the actual spacetime 
curvature as manifested through gravitational strains.3® But Hypothesis II is 
not inferred from such measurements; for it must be posited—no less than 
Hypothesis I—in order to make sense of them.*? 
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In our generation geometric conventionalism has been kept alive mainly by 
Adolf Griinbaum’s acumen and perseverance. Griinbaum’s conventionalism 
does not respond, like Reichenbach’s, to some alleged necessity of the scientific 
method, but purportedly follows from the actual nature of the subject matter 
of physical geometry, namely, real space and time. Griinbaum’s prose is 
renowned for its complexity, and he has often complained that his critics miss 
his meaning due to inaccurate reading. Rather than risk this reproach, I 
renounce in advance all claim to give here a faithful picture of the historic 
Griinbaum (which anyway would probably require another volume like this 
one). I shall comment only on the conventionalist doctrine that is usually 
associated with his name. References to Griinbaum’s writings are only 
intended asa guide to the reader who wishes to pursue the matter on his own.*° 

The main tenet of what we shall call, for short, GGC or Grinbaum 
Geometric Conventionalism, is that physical space and time do not possess a 
metric by virtue of their own structure, and can therefore only obtain one 
“extrinsically”, by arbitrary stipulations regarding the congruence of segments 
and intervals. By a metric on a set S Griinbaum means either what he calls a D- 
metric, i.e. a distance function from S? to R, or an M-metric, i.e. a measure 
function from 2° to R, or an R-metric (R for Riemann), ie. a certain kind of 
cross-section of the bundle of symmetric (0, 2)-tensors over S (which obviously 
can exist only if S is a differentiable manifold).*! I take it that all three 
meanings of meiric are intended in the following early statement of GGC: 


The continuity we postulate for physical space and time furnishes a sufficient condition for 
their intrinsic metrical amorphousness.*? 


This statement is baffling, for it entails that intrinsic metric amorphousness is a 
necessary condition of continuity, whence it follows, by contraposition, that 
every structure to which a metric, in any of the above senses, pertains 
intrinsically, is not continuous. And yet we know that the standard distance 
function d:R? > R, by (a, b)-+|a—b|, induces the topology of the linear 
continuum in the field of real numbers. Our counterexample can be dismissed, 
however, if “the continuity we postulate for physical space and time” differs in 
some essential respect from the (admittedly paradigmatic) continuity of the 
real line. Indeed, the “continuity” in question would have to be quite peculiar, 
and differ from the continuity of all topological manifolds; for otherwise a 
structure (M, g), where M is an n-manifold and g a Riemannian metric on M, 
would be, according to GGC, a contradiction in terms. 
A few years later the main tenet of GGC was restated as follows: 


Be it topological or nontopological, there exists no kind of implicit (intrinsic) basis for 
nontrivial metric relations in continuous P-space. [ P for physical; } assume that a similar 
statement applies to physical time or “T-space”—R.T.]*? 
The new formulation does not say that the continuity of space—which, no 
matter how narrowly we conceive it, is always, I presume, a topological 
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property—is sufficient to vouchsafe its material amorphousness; but that all 
properties of space, be they topological or not, are insufficient to make it 
metrically definite. (Such insufficiency can be invoked to explain why the 
metric of space must then be determined by choice among the many metrics 
compatible with space’s own properties.) Since the properties of space are not 
listed, the second statement of GGC is not, like the first, a proposition that can 
be proved or disproved by analyzing its terms, but an injunction not to include 
in the intension of the concept of physical space any properties that entail that 
space has a metric. 

Our task will be made easier if, instead of inquiring into the somewhat 
elusive reasons for this remarkable injunction,** we ask at once about its 
relevance to General Relativity. For if it turns out that Einstein’s theory 
mindlessly ignores it, we may still indeed look forward to the eventual 
discovery of a plausible theory of gravity more conformable to the enjoined 
requirement, but we need not concern ourselves any further with GGC in the 
present book. Now, the fundamental structure posited by General Relativity is 
neither space, nor time, but spacetime, which is, of course, by definition, a 
4-manifold W endowed with a Riemannian metric g. The following quotation 
shows that, at least at the time it was written, Griinbaum admitted that g, in 
General Relativity, is not a matter of convention, but an essential ingredient of 
the theory’s concept of spacetime. (Here “ds,” denotes the line element 
determined by the spacetime metric g as per (4.3.15). ) 


The space-time measure ds, is an invariant both as between different reference frames and 
with respect to the use of different space and time coordinates in any given frame. It follows 
that two space-time intervals which are congruent in one reference frame in virtue of having 
equal measures ds, in that frame will likewise have equal measures ds, and hence will also be 
congruent in any other frame. Accordingly, there are no alternative infinitesimal space-time 
congruences in the GTR [i.e. General Relativity—R.T.] and the latter kinds of congruence 
are thus unique.** 


It follows that the concept of a relativistic spacetime does not obey the 
injunction of GGC. But the latter, apparently, was never meant to apply to 
physical spacetime, but only to physical space and time. According to 
Grinbaum, 


the uniqueness of the four-dimensional space-time congruences of the GTR [cannot] 
gainsay or detract from the significance of the following claim of mine: the GTR actually 
employs alternative congruences in each of the physically important continua of time and 
space in precisely the sense countenanced philosophically by [GGC]. It is now plain that 
nothing about my claim that the insight contained in [GGC] has substantial relevance to the 
GTR requires that there be alternative space-time congruences in addition to the undeniably 
present alternative time congruences and alternative space congruences.*® 


There is no denying that, if it is at all possible to carve a space and a time out of 
a relativistic spacetime, it can be done in countless ways. Any foliation of the 
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spacetime W into spacelike hypersurfaces defines a time; any congruence of 
timelike worldlines in W defines a space.*’ In a few highly streamlined 
spacetimes some such congruences and foliations are in a sense inbuilt into the 
worldstructure, as we saw in Section 7.1. But in more realistic models there are 
no natural grounds for preferring one method of splitting spacetime into space 
and time over the many other available alternatives. And in the most general 
case, no congruences of timelike worldlines or foliations into spacelike 
hypersurfaces exist, and space and time are meaningless even as abstractions. 
The space and the time determined by a given congruence-cum-foliation are, 
respectively, a 3-manifold and a 1-manifold, which can therefore be endowed 
with proper Riemannian metrics. Following Griinbaum, we shall denote the 
line element of a typical space metric by ds3, and that of a typical time metric by 
ds,. In some special cases, there is a ds, and a ds, uniquely determined by the 
spacetime metric in accordance with the particular structure of the relevant 
congruence-cum-foliation. Thus, if the line element of spacetime can be put 
into the Robertson—Walker form (ds,)? = dt? — (S(t))*do?, the fibres of the 
time coordinate function t furnish the leaves of a suitable foliation, while 1’s 
parametric lines constitute the matching congruence. The proper Riemannian 
line elements ds, = dt, ds; = do, are then clearly prescribed (up to a 
scale factor) by ds,.*% Also, if the spacetime is flat, with line element 
(ds,)’ = n,;dx'dx/, we have a congruence-cum-foliation determined, as in the 
former example, by the time coordinate function x°, and natural line elements 
ds3 = /(—Ngdx*dx*) and ds, = dx° for the resulting space and time. The 
existence of such natural metrics in these cases obviously has to do with the fact 
that the timelike separation between any two leaves of the foliation is the same 
at each point of either, while the spacelike separation between any two lines of 
the congruence is constant or varies uniformly with time. These conditions are 
not fulfilled in less special cases, in which, therefore, the given spacetime line 
element ds, does not determine (up to trivial rescaling) a ds, and a ds,. The 
space and time metrics are then entirely left to our choice, but, for this very 
reason, no matter how we choose them, they will not be physically significant. 
Now, as Graham Nerlich has remarked, if the metric is an imposed convention, 
and factually idle, it is quite inscrutable why we should choose one at all.*? It 
would seem therefore that GGC has nothing to teach us about General 
Relativity. 

In the latest and, in my opinion, clearest statement of his views on geometry, 
Griinbaum (1977) contends that the metric of a relativistic spacetime is not 
given by the mere existence of the latter, but is “constituted”, as he says, by 
“external devices”. Although he does not attempt to derive a conventionalist 
conclusion from this premise (at any rate, he does not imply that there could be 
a viable substitute for the metric in question if the factual claims of General 
Relativity are valid), it will be instructive to examine Griinbaum’s argument. As 
he is not easy to paraphrase, I shall quote him abundantly. Griinbaum does not 
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dispute that any given relativistic spacetime is characterized by a particular 
solution to Einstein’s field equations, i.e. a unique symmetric (0, 2)-tensor field, 
representing the gravitational field. But, he says, 


if a physical field f; , is a symmetric, second-rank, covariant tensor field over space-time, this 
fact alone surely does not guarantee that f;,; must have the physical significance of being the 
Metric tensor of space-time: after all, qua physical tensor field, the stress-energy tensor field 
T,,, for example, has the same formal properties as f,, while certainly not being the metric 
tensor of space-time.°° 


Griinbaum forgets to mention that, by hypothesis, a particular solution to 
Einstein’s field equations defines a Minkowski inner product on every tangent 
space of space-time. Due to this “formal property”, which sets it apart from all 
non-gravitational tensor fields of current mathematical physics—including, of 
course, the stress-energy tensor field mentioned by Griinbaum—a solution to 
Einstein’s field equations is what we call a Lorentz metric. This name is not just 
one more caprice of mathematical terminology, but is meant to hint at the 
unavoidable chronogeometrical implications of a physical tensor field of this 
kind. If the gravitational field is adequately represented by a Lorentz metric it 
cannot fail to have the strictly geometrical effects that General Relativity 
ascribes to it. Having omitted to say as much, Grtinbaum proceeds to ask: 


If g,; is a solution set of Einstein’s equations under specified physical conditions, and thus 
represents some kind of symmetric, covariant, second-rank tensor field over space-time, 
what warrants the particular interpretation that the scalar (g,,dx‘dx/)'/? also has the specific 
physical significance of being the infinitesimal measure for intervals of the physical 
space-time manifold? [...] What qualifies the tensor g,;—qua solution of Einstein's field 
equations—to do double duty physically by representing not only a physical field but also the 
metric tensor of space-time?*! 


The following quotation gives the gist of Grinbaum’s reply: 


What is otherwise just another symmetric, covariant, second-rank physical tensor field g;; 
does control the physical space-time trajectories of photons and freely falling (“free”) 
massive particles, as well as the rates of atomic clocks and perhaps the world-lines of other 
(as yet unknown) entities as well. And at least in NE-GQ space-time [i.e. “non-empty GTR 
spacetimes that are, however, devoid of gravitational radiation”—R.T. ], it is the idealized 
behavior of this particular important class of external entities that is taken to be canonical 
for the metricality of these space—-times in the sense that it is fundamentally constitutive 
ontologically for their metric structures. Thus what is otherwise just a physical tensor field 
gi; thereby acquires specifically geometric, additional, physical significance as being the 
metric tensor of the space-time manifold. And in NE—GQ space-times, g,; acquires this 
geometric significance because—and only because—the scalar |gydxidx!|!? generated 
from that field is (ideally) physically realized by the behavior of that open-ended set of 
external devices that are taken to be canonical for the metric structure of space—time.°? 


We encounter here a curious combination of philosophical realism (with 
regard to the existence of the gravitational field and the differentiable manifold 
on which it is defined) and vestigial operationism (with regard to the metrical 
significance of that field). Lack of time and space prevents me from examining 
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its rationale. (Howard Stein has done it splendidly in his incisive Letter to 
Adolf Griinbaum on Space-Time and Ontology (19775).) But there are a few 
questions I would like to raise before closing this Section. The “constitution” 
of the spacetime metric is attributed by Griinbaum to “external devices”, 
members of a “family of concordant external metric standards” (FCS) which 
includes “photons, atomic clocks, infinitesimal rods, and free massive (gravi- 
tationally monopole) particles”.>? Why are these physical entities described as 
external? After all, they are not something that supervenes upon the relativistic 
spacetime and the gravitational field that characterizes it, but are part and 
parcel of the matter distribution that generates that field. The contribution to 
the energy tensor of matter of the FCS members actually used in our 
laboratories is surely negligible; but, unless I badly misread Griinbaum, his 
FCS includes every photon and atomic clock, not only those that we have 
managed to harness. Now, if the FCS comprises virtually every source of 
gravity—so that, for example, it is the relative abundance of FCS members ina 
Friedmann world that decides whether it will end up in conflagration or in 
thermal death—what sense does it make to say that the gravitational field 
acquires from the FCS an additional significance it does not have by itself? The 
real gravitational field g is what it is by virtue of the actual matter distribution 
T. But, for this very reason, once g is fixed, there is nothing else that the 
matching T can add to it. Griinbaum distinguishes between the field g qua 
metric field and the same g qua “pregeometric gravitational field”.** To me it is 
as if in the formulation of Newton’s theory of gravity one carefully demarcated 
the Galilean force, by which earthly things fall down, from the Keplerian force, 
that keeps heavenly bodies in orbit. Then, a mere gallon of fuel added to a 
ballistic missile would suffice, at the margin, to “constitute” as a proper 
Keplerian force what is otherwise just a pre-Keplerian Galilean force on the 
missile. Griinbaum’s distinction is of less practical consequence, because all 
feasible tests of the gravitational field g must use FCS members as probes. 
Should we ever succeed in detecting gravitational waves and learn from them 
something about the metric structure of the world, a philosopher might still 
argue that the properties thereby disclosed are “geometric” only by virtue of the 
ontologically constitutive intervention of the FCS members excited by the 
waves, but “purely gravitational” in the light of the waves alone. In fact, 
Grunbaum has already served us warning that, if the null geodesic trajectories 
of gravitational radiation in an empty spacetime (T = 0) were declared to be 
not just “gravitationally null”, but “geometrically null” as well, “with a view to 
establishing thereby the geometric physical significance of the g,, tensor, this 
would be physically tantamount to no more than just calling this tensor 
‘metric’.”** This remark leads to my last question. If spacetime is a 4-manifold 
it is—whether we perceive it or not—criss-crossed by curves. If the gravi- 
tational field g is defined on the manifold W, there is a unique symmetric linear 
connection V in W such that Vg = 0. A curve y in W is a geodesic of the 
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connection if and only if, at every point of y, V5 = 0. If this condition is met, y 
is of course energy-critical on every closed subinterval of its domain, ie. it is a 
Riemannian geodesic (of g). All this follows solely from the “formal 
mathematical properties” of W, g and y, whatever the make-up of the physical 
entities on y’s path. Shall we say that y is “geometrically geodesic” only if it is 
the worldline of a gravitating particle, but that it is merely “gravitationally 
geodesic” if it runs across a perfect vacuum? Even those who revel in such 
fine distinctions will find something very odd about the terms chosen to 
express this one, for the curve is a real gravitational trajectory only in the 
former case, while in the other it is precisely what anyone would call a purely 
geometrical line. But there is no question that you can make words mean so 
many different things, and I, for one, will not grudge a philosopher’s claim to 
be their master.*° 


7.3 Remarks on Time and Causality 


Common sense ideas of space and time are tightly knit together with our 
ordinary notions and experiences of causality. We all know that—other things 
being equal—it is easier to change what is near at hand than what resides far 
away; and, until the advent of radiotelecommunications, distance was a serious 
obstacle to the prompt fulfilment of our wishes. Agents such as light and 
gravity, that can influence very distant objects very fast, are also extremely 
weak; while those means of action which keep their full strength as they travel 
from their source—e.g. a bomb mailed from one country to another to blow up 
a political exile—are slow to reach their goals. Before Einstein, however, 
nobody appears to have seriously disputed that any two events might be 
causally related to each other, regardless of their spatial and temporal distance. 
The denial of this seemingly modest statement is perhaps the deepest 
innovation in natural philosophy brought about by Relativity. It has 
completely upset our traditional views of time, space and causality, as the 
following remarks will illustrate. 

In a relativistic spacetime two events can be causally related only if their 
respective worldpoints are joined by what we have called a causal curve, i.e. a 
smooth curve that is nowhere tangent to a spacelike vector. In this way, the 
spacetime metric strictly regulates what events can and cannot influence or be 
influenced by a given event. In the notation of Section 4.6, if the spacetime 
manifold is denoted by W;, the set of all worldpoints joined to a given PEW by 
causal curves is J (P), that is, the union of P’s “causal future” and “causal past”. 
The complement of J (P)in Wis the set || (P) of worldpoints “separate” from P 
(p. 121). For some worldpoints P in some spacetimes (W,, g), || (P) may indeed 
be empty. But, as we shall see, in such spacetimes one cannot properly define a 
time, and causal relations have quite unusual features. Thus, in a relativistic 
world where some events could be casually connected with all the rest, our 
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familiar notions would be fully inapplicable. And, of course, in the more 
familiar case of Special Relativity—where (#, g) = (.@,n)—we have that, for 
each worldpoint P, every neighbourhood of P has a non-empty intersection 
with ||(P). This means that, in the nearest proximity of any event, other events 
take place which cannot affect it or be affected by it. This alone is sufficient to 
shatter the received views that sustain our common talk and even shape our 
grammar. 

We speak a tensed language, that neatly distinguishes between past and 
future, and, somewhat less neatly, between them and the present. Our use of 
tenses is presumably the source of Aristotle’s conception of the ‘‘now”’ (p. 220). 
But our tensed speech is hardly appropriate for dealing with the idealized 
instantaneous events that are essential to the mathematical physicist’s 
handling of Aristotelian time.While we can and do refer with all the required 
exactness to such punctual events in the past and future tenses, all our 
statements in the present tense are necessarily about states and processes that 
last some time. “The bell rang at 9:37:15 A.M.” may refer with split-second 
precision to the time at which the first electric pulse that set the bell ringing 
stopped a chronometer that was connected to the bell’s circuit for this purpose. 
(The numbers can even make allowance for any known time lag between the 
current’s arrival at each instrument.) “The bell is ringing now” cannot refer to 
an instantaneous event, but only to a process that has already begun and will 
still, no matter how briefly, go on. (If the bell stops ringing while I speak, the 
statement in the present tense is false and must be corrected by reformulating it 
in the past tense.) If we try to introduce mathematical exactness into ordinary 
language by restricting the scope of the present tense to the Aristotelian now, 
we are led to the paradoxical conclusion that all that actually exists is nothing 
but the vanishing boundary between that which was and is no longer and that 
which will be and is not yet—a classical instance of philosophical self- 
entrapment through misuse of language.’ Obviously, our common usage was 
not designed to serve the purposes and needs of physics. Quite justifiably, no 
attempt has been made to adapt it to the advances of physical knowledge. We 
say in good English, ina clear night, that such-and-sucha star “is shining”, even 
though, for all we know, it could have been annihilated four or more years ago. 
To speak otherwise, to say, for example, pointing at the splendid pattern of 
Orion, “Look! Rigel was shining while Frederick Barbarossa was on his fourth 
expedition to Italy!”? would not only be preposterously pedantic, but would, 
from a relativistic standpoint, raise ticklish questions regarding our choice ofa 
particular dating system. For assuming—to avoid even greater compli- 
cations-—that spacetime is Minkowskian, there 1s no reason to refer all events 
to the inertial system in which our dear selves happen to be momentarily at 
rest. We can also choose a Lorentz chart whose origin coincides with our 
statement, relative to which the light we see left Rigel ata much more recent 
date.? 
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The muddles in which one may get entangled by naively stretching to Special 
Relativity the standard use of the present tense are very well exemplified by the 
doctrine of chronogeometrical determinism put forward by C. W. Rietdijk 
(1966, 1976) and H. Putnam (1967). The doctrine recalls the logical determin- 
ism that is often attributed to the Stoics. According to the latter, every 
proposition has a truth value, true or false. Consequently, it is argued, the 
future is fully determined by the true propositions that describe it and I am 
not free to change it. To counter this facile argument, some libertarian 
philosophers deny that all propositions have a truth value. Contingent (i.e. 
non-necessary) propositions about the future acquire whatever truth value 
they will have as time goes by. This thesis is obviously tailored to fit the 
Aristotelian idea of time:* the “now” is the omnipresent catalyser that activates 
the transmutation of indeterminate propositions into truth or falsehood. The 
thesis is therefore hard to reconcile with the relativistic view of time. But there 
is another, subtler answer to the logical determinist’s ploy: that every 
proposition p is necessarily either true or false does not entail that p is either 
necessarily true or necessarily false.> The future will be what it will be, but this 
does not mean that it is already necessitated to be that way, regardless of what 
might happen between now and then. Truth-functional logic does indeed 
require every fact to be definite, but not that it be defined by other facts that 
precede it. An act of freedom is described by a proposition that is always true, 
but utterly groundless, and thus impossible to predict according to general 
laws from the true description of earlier facts. This second reply to logical 
determinism, which was worked out by medieval philosophers bent on 
reconciling human freedom with divine foreknowledge, is independent of any 
particular conception of time, and is therefore preferable to the former froma 
relativistic standpoint. It will also be helpful in dispelling the confusions of 
chronogeometrical determinism. 

Rietdijk’s and Putnam’s arguments rest entirely on the following fact: If A 
and B are two events joined by a timelike worldline in Minkowski spacetime, 
there is an event C which is Einstein simultaneous with A in one inertial frame 
and with B in another.® In particular, if A is an event in my life, here and now, 
and Bis an event in my future, there is an event C which in my own momentary 
inertial rest frame (mirf) is happening right now and yet takes place in 
someone else’s mirf at the same time as B. What happens now is real (Putnam) 
and determinate (Rietdijk). Thus, C is no less real and determinate than A, for 
both occur now (in my mirf); and Bis no less real and determinate than C, for 
both occur now (in another frame). Since the reality and determinateness of an 
event does not depend on the inertial frame to which it is referred, Bis just as 
real and determinate as A. Therefore 1 may conclude, much like the logical 
determinist, that even as I live through A, I am no longer free to shape B. 

The argument shifts without warning from the present tense of ordinary 
conversation—appropriately used for stating what happens now in my or your 
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frame—-to the tenseless present of science—needed for making absolute 
ontological assertions. (Unless there is an absolute now, as in the Aristotelian 
world, there can be no frame-independent use of the present tense.) Putnam 
assumes that “All (and only) things that exist now are real”. Here “exist”, 
modified by the adverb “now”, is present tense; but “are”, as in “are real”, is 
tenseless present.” But how can that which exists now, relatively to an observer 
and his chosen frame, be coextensive with the real? Or should we understand 
Putnam to mean only that “all things which exist now are real now”—real, that 
is, in (and only in) the relative space of the inertial frame to which the now is 
referred? But this interpretation ruins the argument, which depends, as we saw, 
on the absoluteness (frame-independence) of reality. Each event is (tenselessly) 
real and determinate, in this absolute sense, at its own worldpoint. No tensed, 
frame-dependent statement can add to or detract from its reality and 
determinacy. In our particular example, C, in order to be Einstein simul- 
taneous in different frames with A and B, must be joined to each of these events 
by a spacelike geodesic. C is therefore separate from both A and B, whose 
reality and determinacy is thus in no way conditioned by that of C. Whether B 
is caused by A or is totally unaffected by it is a question to which the relation 
that each has to C cannot have the slightest relevance. 

In order to be determined when and where it happens, an event need not be 
pre-determined. The latter predicate is properly asserted of an event E only if E 
is (in principle) predictable from its own past, ie. from the set J ~(E)— {E}. 
But there is, of course, no way of proving by geometry alone that every event in 
a Minkowskian universe is thus predetermined. Rietdijk, who claims that the 
pre-determination of all such events is entailed by the argument explained 
above, defines “predetermined” in a different way: 


We say that an event P is (pre-)determined if, for any possible observer W, (that is, for all 
possible observers, and even for all other, e.g., physical, instances), who has P in his absolute 
future (that is, that the future part of W,’s course through the four-dimensional continuum 
may eventually pass through P), we can think of a possible observer W, (or: there may exist 
an observer W,) who can prove, at a certain moment 7,, that W, could not possibly have 
influenced event P in an arbitrary way (e.g., have prevented P) at any moment when P still 
was future, or was present, for W,, supposed that W, did desire to do so.® 


To see whether Rietdijk’s claim follows from this definition, let us apply it to 
the example we considered above. Let B be the arbitrary event P of which we 
want to know whether it is predetermined in Rietdijk’s sense. LetW, denote an 
inertial observer who has B in his future as he undergoes event A. Let W, be 
another inertial observer, in whose frame B is Einstein simultaneous with C, an 
event that is Einstein simultaneous with A in the frame of W,. Is there a time Tp 
in which W, can prove, from the information provided, “that W, could not 
possibly have influenced event P in an arbitrary way at any moment when P 
still was future, or was present, for W,, supposed that W, did desire to do so”? I 
assume, though the definition does not say so, that the time 7, is to be some 
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(Einstein) time in the frame of W,. (It would be uncharitable to regard 7, as an 
absolute time, unwittingly smuggled into the argument.) Under the stated 
circumstances, W, can prove something about B if he is acquainted with B. 
Consequently, 7, cannot be earlier than the time—call it Tg—at which C and B 
occur inW,’s frame. W, will know Bat that time only if B lies on W,’s worldline. 
In that case, W,’s history from A to B lies at Tz in W,’s past, and W, cannot 
prove from purely chronogeometrical considerations that Bis not the outcome 
of an arbitrary influence exerted through that history. If W,’s worldline does 
not pass through B,W, can only learn about B some time later than 73. At that 
time, again, W,’s history before B will lie in W,’s past and W, will be unable to 
prove, without additional, more substantive information, that B does not 
result from a free decision taken by W, at or before the time of B. The fact that 
W,, as he finally gets to know B, may also learn that in his frame B is 
simultaneous with an event C, which is simultaneous with A in W,’s frame, 
cannot modify our conclusion. Thus, even if we adopt Rietdijk’s contorted 
definition of predetermination, we may not maintain, on the sole strength of 
Minkowski geometry, that every event is predetermined in a flat relativistic 
spacetime. 

Turning now to the consideration of time and causality in an arbitrary 
relativistic spacetime (W,, g), we note first ofall that every worldpoint Pe Whas 
a neighbourhood that mimics the causal structure of an open submanifold of 
Minkowski spacetime. For the exponential mapping Expp maps the null, 
spacelike and timelike vectors in a neighbourhood of zero in the tangent space 
at P respectively into null, spacelike and timelike geodesics through P. By 
designating one of the two sheets of the null cone in the tangent space Wp as 
“future”, we can classify all causal curves through P into future-directed and 
past-directed. This classification constitutes what we may call a local time 
orientation at P. The spacetime (W,, g) is said to be time-orientable if such a 
classification can be smoothly extended to all ¥, i.e. if a local time orientation 
can be established at each Pe W which agrees smoothly with the local time 
orientations at all neighbouring points.’ Evidently, (W, g) is time-orientable if 
and only if # admits a smooth field of timelike vectors. (If X is such a field, we 
may choose the local time orientation at each worldpoint so that the local value 
of X is future-directed. Since X is smooth, the time orientation will be 
smoothly extended throughout W-) But there are other, more handy criteria of 
time-orientability. Let us say that a curve y in W is time preserving if every 
vector parallel along y to an arbitrarily chosen future-directed vector v at a 
point of y is also future directed. (The reader ought to satisfy himself that the 
property of being time preserving does not depend on the choice of the vector 
v;in other words, that if any future-directed vector changes to past-directed by 
parallel transport along y all its neighbours in the interior of the future null 
cone at its point of origin suffer the same mutation.) The relativistic spacetime 
(W, g) is time-orientable if and only if every closed (smooth or piecewise 
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smooth) curve through a fixed, arbitrarily chosen worldpoint P is- time 
preserving.’° It can be shown moreover that if a closed curve y through a given 
PeW is time preserving, every other curve through P, homotopic to y, is also 
time preserving.'' It follows that the relativistic spacetime (W, g) is time- 
orientable if—though not only if—the manifold is simply connected, i.e. if 
every closed curve in W is homotopic to a point. (Any worldpoint may be 
regarded as the image of some real interval by a constant mapping }; y is then 
trivially a time preserving closed curve.) This result is quite significant, for 
every relativistic spacetime (W, g) has a universal covering space (W, 8), Le. a 
covering space that is simply connected.'? (W, %) is constructed as follows. 
Choose a point Pe W. For each point Q € Wwe consider all curves from P to Q. 
If two such curves are homotopic we say that they belong to the same 
homotopy class (see note 11). The set of all pairs (Q, [y2]), such that Qe Wand 
[y2] is an homotopy class of curves from P to Q, is a covering space over W, 
with the covering map f (Q, [y2])> Q (see note 12). We denote it by W. W has 
a unique differentiable structure relative to which fis smooth. The pull-back of 
g by f, f*g, is a Lorentz metric on W, which we denote by g. The 
knowledgeable reader will note that, by the construction of W, the funda- 
mental group of W (the so-called first homotopy group 7, (W, P)) acts freely 
on W, its action being transitive on each fibre of the covering map f. If W is 
simply connected all curves from P to a given point Q are homotopic and there 
is only one homotopy class [2] to each QeEW, f is therefore bijective and 
indeed a diffeomorphism, and we may identify (W, g) with its own universal 
covering space. But even if W is not simply connected, every PEW has a 
neighbourhood diffeomorphic to a neighbourhood of each of the multiple 
elements of the fibre of fover P, and it is impossible to distinguish between the 
spacetimes (W, g) and (W, %) by means of local experiments. Since every 
experiment is local—including the collection of light on photographic film 
placed at the receiving end of a telescope—we can safely assume that the 
spacetime we live in is simply connected and hence time-orientable. (Note 
however that if our spacetime were the universal covering space of a multiply- 
connected spacetime, it would contain several identical copies of a neighbour- 
hood of each worldpoint, so that every individual event, process, thing or 
person in our experience would be exactly repeated at least twice—indeed as 
many times as there are homotopy classes of closed curves through a given 
point of W. This vexing iteration could then be suppressed at a stroke by 
identifying our spacetime with the multiply connected spacetime in question, 
rather than with its universal covering space.)'? 

Time order and causal influence would make no sense in a relativistic 
spacetime that is not time-orientable. But even in one which is time-orientable 
they may have only a local significance. The relations of causal and 
chronological precedence (Section 4.6) can be formally introduced in a time- 
orientable relativistic spacetime (W,, g) by choosing a global time orientation 
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and stipulating that, for any pair of worldpoints P and Q, P causally precedes 
Q(P < Q)ifand only if there is a future-directed causal curve from P to Q; and 
that P chronologically precedes Q(P <Q) if and only if there is a future- 
directed timelike curve from P to Q. But (W, <, <) might not bea causal space 
(p. 123), for it may well contain closed causal curves. As a matter of fact, several 
exact solutions of the Einstein field equations, such as the Gédel, the Taub- 
NUT and the extended Kerr solutions, are known to contain such curves.'* An 
event in the range of a closed causal curve y would causally precede and be 
preceded by all other events in y’s range, so that it could be both their cause and 
their effect. As this would be utterly at variance with our received notion of 
causality, many authors believe that closed causal curves are unphysical, and 
that a solution to the Einstein field equations is admissible only if it satisfies the 
so-called causality condition (p. 337, note 23) i.e. if the spacetime defined by it is 
truly a causal space. Stephen Hawking (1971) equates “ordinary causality” 
with “the absence of closed timelike curves”. He observes that, 


if there were such curves, one could in theory travel round them and arrive in one’s past. The 
logical difficulties that could arise from such time travel are fairly obvious: for example, one 
might kill one of one’s ancestors. These difficulties could be avoided only by an 
abandonment of the idea of free-will: by saying that one was not free to behave in an 
arbitrary fashion if one travelled into the past.'> 


However, as Hawking and Ellis (1973) remark, free will “is not something 
which can be dropped lightly since the whole of our philosophy of science is 
based on the assumption that one is free to perform any experiment”.'® By the 
same token, a physically reasonable relativistic spacetime should not contain 
any almost closed causal curve, i.e. a causal curve which, even though it does not 
intersect itself, enters more than once in the same infinitesimal neigh- 
bourhood—in other words, it must be a strong causal space (p. 125 and 
p. 308, note 6). It is not hard to see that in a strong causal space, every point of 
which has a neighbourhood that is causally isomorphic, by the local 
exponential mapping, with an open submanifold of Minkowski spacetime, the 
set ||(P) of points separate from any given point P cannot be empty, and 
indeed intersects every neighbourhood of P. This ratifies our earlier statement 
that || (P) can be empty for some worldpoint P only in a spacetime in which 
one cannot define a time, and causality has unusual features. As we now see, 
such spacetimes are not allowed as physically viable by two major authorities 
in the field. Hawking and Ellis go in fact even further. A strong causal spacetime 
might be, as they say, “on the verge of violating” the foregoing requirements, if 
a small variation in the metric suffices to generate closed timelike curves. “Such 
a situation would not seem to be physically realistic since General Relativity is 
presumably the classical limit of some, as yet unknown, quantum theory of 
space-time and in such a theory the Uncertainty Principle would prevent the 
metric from having an exact value at every point. Thus in order to be physically 
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significant, a property of space-time ought to have some form of stability, that 
is to say, it should also be a property of ‘nearby’ spacetimes.”!’ To make this 
demand precise, the authors introduce a topology in the set of Lorentz metrics 
admissible by a given 4-manifold (I have defined it on p. 337, note 17), and go 
on to postulate that a relativistic spacetime (W, g) will be deemed to be 
physically reasonable if and only if it is stably causal, i.e. if and only if the 
metric g has a neighbourhood U in the said topological space of Lorentz 
metrics on W, such that for any g’ € U, the spacetime (W, g’) does not contain a 
closed timelike curve. As I mentioned on page 228, Hawking (1969) showed 
that stable causality is both necessary and sufficient for the existence ofa global 
time coordinate function. Thus, the same requirement that apparently must be 
met by any causal structure compatible with the practice of experimental 
science, ensures the possibility of partitioning events—in many different, 
generally arbitrary ways—into linearly ordered simultaneity classes; a poss- 
ibility that is essential to our received idea of physical time and in fact 
equivalent to stable causality. However, not all researchers subscribe to 
Hawking and Ellis’ outright rejection of closed causal curves. Thus, Geroch 
and Horowitz (1979) recall that “physical theories frequently suggest new and 
unexpected phenomena, which are later found to be realized physically”, and 
propose that one should keep an open mind on this matter.‘® 

Another feature of relativistic spacetimes which would seem to defy our 
received view of causality—specifically, of the role of causal interaction in the 
constitution of the universe—are the so-called event and particle horizons. 
Following Rindler (1956), we shall define these concepts for a relativistic 
spacetime (W, g) in which the worldlines of massive particles (resp. the integral 
curves of the worldvelocity field of matter) constitute a congruence. Such a 
congruence can be made in an obvious way into a 3-manifold M. Consider a 
worldpoint P in the worldline of a particle z. We say that a worldline y € M lies 
beyond P’s horizon (or beyond z’s horizon at P )if no point in the range of y lies 
in the causal past of P. The particle horizon of P (or of z at P) is the boundary 
(in M) of the set of worldlines that lie beyond P’s horizon.'® Unless the particle 
horizon of P is empty, there are particles in the world W—namely, those 
beyond P’s horizon—which have never exercised any influence on z at the time 
of P. Wealso say that a worldpoint P €W (or an event at sucha worldpoint) lies 
beyond the horizon of a particle z if no point in the worldline of 7 lies in the 
causal future of P. The event horizon of a particle x (or of its respective 
worldline) is the boundary (in W) of the set of events that lie beyond z’s 
horizon. The reader should satisfy himself that the event horizon of z is none 
other than the boundary of z’s causal past (p. 121). Unless the event horizon of 
nm is empty, there are events in the world W—namely, those beyond z’s 
horizon—which will never have any influence on z. Rindler’s concepts of 
particle and event horizons are nicely symmetric: The particle horizon of an 
event P is the boundary of the set of particles whose worldlines do not intersect 
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the causal past of P. The event horizon ofa particle z is the boundary of the set 
of events whose causal futures do not intersect the worldline of z.2° Non- 
empty particle and event horizons are nota rarity in relativistic spacetimes. An 
event horizon envelops each “black hole”, that is, each spacetime region where 
the gravitational field is so strong that a signal originating within it is forever 
trapped inside it.?! Particle horizons occur in all the likeliest relativistic world 
models.?? Thus, they exist in every non-empty expanding Friedmann world 
with initial singularity, and also in the anisotropic generalizations of the 
Friedmann solutions studied by Taub (1951). Let M be the 3-parameter family 
of matter worldlines of such a Big Bang world and pick any y € M. The particle 
horizon of any P ey divides M into two components, one of which contracts to 
y as P approaches the initial singularity. This means that each speck of matter 
is “born” at the Big Bang in total isolation from all the rest, and only little by 
little increases its “circle of acquaintances”, i.e. the set of particles on which it 
has acted and which have acted on it. If, as seems likely, we live in a universe 
roughly representable by a Friedmann or a modified Friedmann Big Bang 
model, then, at this very instant, new galaxies are entering our particle horizon 
from opposite parts of the sky, which hitherto have not had the chance of 
acting on us or upon each other. Since the phenomena that first disclose to us 
such emergent galaxies must inevitably be read as instances of the same natural 
laws that govern all things known hitherto, it is clear that we must regard all 
matter, manifest or forthcoming, as constituting a single world not held 
together by the bonds of causal interaction. The existence of non-empty 
particle and event horizons therefore means that, contrary to our inherited 
views, causality is not the “cement” of a typical relativistic universe. In itself, 
this ts not surprising, for the very possibility of causation obviously 
presupposes a system of laws under which it can take place. Things do not have 
to rub themselves against each other in order to become parts of a common 
nature. They must rather be, of themselves, sufficiently alike in order to sense 
their mutual presence. In General Relativity, as in any rational system of the 
world, the formal causality of structure sets the stage for the efficient causality 
of things. Horizons merely make more evident, in the worlds where they occur, 
this anyhow unquestionable order of ontological precedence. I must add, 
however, that although there is nothing really disquieting about horizons as 
such, the particle horizon of the Big Bang models is a source of paradox under 
the particular circumstances of our world. The remarkable isotropy of the 
microwave background radiation discovered by Penzias and Wilson signifies 
that widely distant parts of the universe were in virtually the same thermal state 
when that radiation was emitted. If the available data are interpreted in the 
light of General Relativity it turns out that at the time of emission some of the 
sources must have lain beyond each other’s horizons. The thermal uniformity 
of causally unconnected world regions is probably too unlikely a coincidence 
to be simply taken for granted. C. W. Misner, who has long been worried by 
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this problem,”* now believes that its solution will involve modifications of the 
Einstein field equations, or at least their quantization.?* Some such change 
should anyway be necessary in order to reconcile Einstein’s theory of gravity 
with the accepted theories of the other fundamental forces of nature. An 
examination of what is being done with this purpose lies, however, beyond my 
scope. 


Appendix 


A Differentiable Manifolds 


The theory of differentiable manifolds was founded by Bernhard Riemann 
in 1854 as a generalization of Carl Friedrich Gauss’ treatment of smooth 
surfaces in Euclidean space.! Such surfaces can be metrically and even 
topologically quite different from the Euclidean plane, yet they can always be 
represented “patchwise” ona region of the latter, as the earth is charted on the 
pages of an atlas. This method of representation enabled Gauss to define and 
study the properties of surfaces “intrinsically”; i.e. without regard to the 
manner now they lie in space. The theory brought to life by Riemann grew 
richer and clearer in the century following his death, especially after Einstein 
used it in General Relativity. The following definitions and remarks agree with 
the now standard approach to the subject. 

A topological space S is an n-dimensional manifold if every point of S has a 
neighbourhood homeomorphic with R”. A chart of S isa homeomorphism of 
an open set of S onto an open set of R".? We designate a typical chart by x, its 
domain by U,, or dom x. x assigns to each point Pe U,.a list of n real numbers 
x'(P),..., x"(P), called the coordinates of P by x. The mapping x': U, > R; 
P.4x'(P) is the i-th coordinate function of x (1 Si<n). If U, =S,x isa 
global chart. 

Let x and y be two charts of S. The composite mapping x-y~" is a 
homeomorphism of one region of R” onto another, namely, of y(U, © U,) onto 
x(U, AU,). x:y~' and y-x~! are the coordinate transformations between x 
and y. x and y are C* -compatible if both coordinate transformations between 
them are differentiable mappings of class C* (k = 1). A collection A of C*- 
compatible charts of S, such that each point Pe S lies on the domain of some 
chart in A, is a C*-atlas of S. (Observe that A may consist of a single chart x, 
provided that x is global.) Given a C* -atlas A of B, there is a unique maximal C* 
-atlas A,,,, of S, comprising every chart of S that is C * compatible with the 
charts in A. (Exercise: Show that any two charts in Ama, are C“-compatible.) 
A non-empty subset of Ama, which is a C*-atlas of S in its own right is called a 
C"-subatlas (h = k). 

Let S be an n-dimensional manifold. Let A be a C*-atlas of S. The pair 
(S, Amax) is a real n-dimensional C*-differentiable manifold? (S, Ajax) i8 said to 
be orientable if Ama, contains a C*-subatlas A*, such that the Jacobian 
determinant of any coordinate transformation between two charts of A* is 
positive at every point of its domain. The atlas A * is said to be oriented and to 
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bestow an orientation on the manifold. In this book, spacetime manifolds are 
generally assumed to be orientable. To avoid annoying exceptions we assume 
also that every manifold mentioned has the Hausdorff topology, ie., that 
any two points of it lie in mutually disjoint open sets. A 
real n-dimensional C”-differentiable manifold with a Hausdorff topology 
will be called an n-manifold. Note, in particular, that R" is an n-manifold (with 
the atlas comprising the identity mapping that sends each list of n real numbers 
to itself). 

Until the end of this Section, S will denote an n-manifold and S’ an m- 
manifold. If U is open in S, U is an n-manifold with the atlas consisting of the 
restrictions to U of the charts in the maximal atlas of S. U, with this atlas, is said 
to be an open submanifold of S. The Cartesian product S x S‘ is made into an 
(n+m)-manifold by the stipulation that, if x and y belong respectively to the 
maximal atlases of S and S’, the mapping that sends each pair (P,Q)eS x S'to 
(x(P), y(Q))€R"*" is a chart of S x S’. S x S’, with the atlas consisting of all 
such charts, is called the product manifold of S and S’. 

A mapping fof S into S’ is said to be C’-differentiable at Pe S (r < oo) if, for 
some chart x of S defined at P and some chart y of S’ defined at f(P), the 
composite mapping y-f-x' is differentiable of class C’ at x(P). (Show that 
this definition does not depend on the choice of x and y.) fis C’-differentiable if 
it is C’-differentiable at every point of S. If n=m and f is a bijective 
C-differentiable mapping with C®-differentiable inverse f~', f is a dif- 
feomorphism and S and S' are said to be diffeomorphic. (A diffeomorphism is an 
isomorphism of differentiable manifolds.) In this book we need not worry 
about orders of differentiability. I speak therefore loosely of smooth mappings. 
By smooth I mean C*-differentiable for some definite integer k, no less than 1 
and as large as is required for the purpose at hand.* 

If x is a chart of S and fis a smooth mapping of S into R, the composite 
mapping f-x~' is a smooth mapping of R" into R. We write f; for the i-th 


partial derivative of f-x~1,° and set 


of /dx! = f;°x (A.1) 


A smooth mapping of S into R is called a scalar field on S. The collection 
F (S) of all scalar fields on S is made into an algebra over R (i.e. a real vector 
space that is also a ring, in such way that addition in the ring agrees with vector 
addition). Let P be any point of S and let fp denote the value of the scalar field 
fat P. Vector addition (i), scalar multiplication (ii) and ring multiplication 
(iii) are defined as follows, for all f ge F (S) and all aeR: 


(i) (f+ 9)p =fe+9P 


(ii) (af) p = ofp 
(iii) Ug)p=fr gp 


APPENDIX 259 


The collection ¥(P) of all scalar fields defined on some neighbourhood of 
P eS is also an algebra over R, with the operations defined as above, for any 
pair of elements of ¥(P), in the intersection of their domains. 

A continuous mapping y of an interval I < R into S is called a curve in S. 
(The variable argument of y is usually called the curve’s parameter; if f is the 
restriction to I of ahomeomorphism of R onto itself, the curve y’ = y-f~ isa 
reparametrization of y.) The range of a curve y is sometimes called a path. In this 
book, unless otherwise stated, we assume that a curve y in an n-manifold S is 
always smooth or at least piecewise smooth (i.e. smooth everywhere except at a 
finite number of points). If its domain J is not open, y must be the restriction to 
I of a smooth curve defined on an open interval—this condition ensures the 
smoothness of y at the endpoints of J. Let P = y(u) for some u € I. The tangent 
to y at P, denoted by 7p, is the linear mapping of F (P) into R defined, for each 
fEeF(P) by: 


df-y 


ari (A.2) 


yp(S) = 


Instead of yip(f) we shall write ypf. (Observe that the symbol ‘jp’ is ambiguous 
if P = y(u) for more than one value of the parameter u; see page 261.) The set 
of all tangents at P to curves in S is a subset of the vector space (F (P))* of 
linear functions on ¥(P). In fact, as we shall see, it is an n-dimensional 
subspace of (F (P))* 

Let x be a chart of S defined at P. Each coordinate function x'(1 <i <n) 
determines a curve y’‘, which we shall call the i-th parametric line of x through P. 
y' is defined as follows: for each suitable real number ¢, y‘(t) is the point Q in 
dom x such that x'(Q) = tand, for j # i, x/(Q) = x/(P). (Observe that y' maps 
a suitable interval in the range of x' onto the set of points in S that share with P 
all x-coordinates except the i-th.) Set x'(P) = u. If fe F¥(P), 


, d(f-y; 1 . 
Vet = ae = lim Ff uh) fu) 
t |=u acoh 
= lim L pexc! (x! (P),..., xi(P)+h,..., x"(P)) 
Aoon 
—fox7! (x"(P),..., x! (P), ... x" (P)) 
_of 
= 5x, (A.3) 


In agreement with this result, we shall hereafter designate the tangent at P to 
the i-th parametric line through P of a chart x by 6/0x'|p. The collection 
(0/0x"|p, .. ., 6/0x"|p) is a free family of vectors in (F (P))*.° Hence it spans 
an n-dimensional subspace of (¥ (P))*. We shall now prove that this subspace 
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is identical with the set of tangents at P to curves in S. In anticipation of this 
result we call it the tangent space of S at P. We shall denote it by Sp. We show 
first that, if yp is the tangent at P toacurve y, ‘pis equal to a linear combination 
of the 0/0x'|p. If P = y(u), fe F (P)and f; is the i-th partial derivative of f- x7’, 


a) _ yy sve) Sie! 


We now show that if ve Sp there is a curve y in S such that y(0) = P and 
}p = v. By hypothesis, v = X,v'd/dx'|, for some set of n real numbers v’. Set 
u; = uv'+x'(P). The curve y:urox ! (u,,..., u,) is then defined in some 
interval containing 0, and it is clear that y(0) = P. We have that 


xi _ d(x" y) 
Ye di 


Hence, by (A.4), yp = 2; (yp x')d/dx'| p = Xjv'd/0x'|p = v. QED. 

An element of Sp is called a vector at P. A rule V that assigns to each point P 
in S a vector Vp at P is called a vector field on S if it fulfils the following 
requirement of smoothness: for each fe ¥ (S), the mapping Vf: P+ Vpf is a 
scalar field on S.’ Evidently, the correspondence 0/0x': P1+0/dx'|, is a 
vector field on the open submanifold dom x. In particular, the identity 
mapping t+>¢ on R determines a vector field 0/ét. If V is a vector field on S,a 
curve y on S, such that, for each point P in its range, jp = Vp, is an integral 
curve of V. For each P in S there is an integral curve y of V such that (0) = P 
and every other integral curve of V that meets this requirement is the 
restriction of y to a subset of its domain; we call this curve, the maximal integral 
curve of V through P. The vector field V is said to be complete if all its maximal 
integral curves are defined on R. The collection ¥"'(S) of all vector fields on S 
is made into a module over the ring F (S) of all scalar fields on S. For every V, 
Wev !(S) and every fcF(S), the vector sum of V and W is the vector field 
(V + W) which assigns to each Pe S the vector V p+ W>p; and the product of V 
by f is the vector field {V which assigns to each PeS the vector fpV p. 

The dual space of Sp—.e. the vector space of real-valued linear functions on 
Sp—is called the cotangent space of S at P and will be denoted by S#. Its 
elements are called covectors at P. A covector field on S is defined like a vector 
field. The collection ¥, (S) of covector fields on S is a module over F (S), with 
operations analogous to those on ¥'(S). If U is an open submanifold of S, 
each smooth real-valued function fe ¥ (U) determines a covector field dfon U, 
called the differential of f; dfassigns to each P € U the covector dfp, defined, for 
all ve Sp, by the relation 


tpf= 


(A.4) 


x(P) 


_ xi(P)+hv' —x!(P) 
= lim 


0. ho h 


=v! (A.5) 


dfp(v) = of (A.6) 


The differentials dx! of the coordinate functions x! of a chart x define, at each 
Pedom x, a basis (dxp) of the cotangent space SB. As dx (0/dx!|p) 
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= 6x'/6x!|p = 6; ;, (dx) is none other than the dual basis to (0/0x'|p).® 

The concept of a differential can be extended to arbitrary differentiable 
mappings of manifolds. Let f:S — S’ be smooth. The differential of f, denoted 
by f,, assigns to each PeS the linear mapping f, p of Sp into S’,p), which is 
defined as follows: Each vector v € Sp is the tangent at P toacurve y in S; we set 
J, p(v) equal to the tangent at f(P) of the curve f: y in S’. (The reader ought to 
satisfy himself that this definition does not depend on the choice of a particular 
curve y among the many to which v is tangent.) We have that, for each scalar 
field g defined on a neighbourhood of f(P), 


Sxp(v)g = v0(9f) (A.7) 


Note that if S’'=R, f, =df. If y is a curve in S such that y(u) = P, 
Yeu0/Ot = yp. With some impropriety, I often write y,, for jp. (This notation 
removes the ambiguity inherent in “yp” if P = y(u) for more than one value of 
the parameter u.) By the rank of f:S — S’ at a point Pe S we mean the rank of 
f, p» Le. the dimension of the vector space f, p(Sp) < S'jp). 

A smooth mapping fof the n-manifold S into the m-manifold S’ is said to be 
regular at Pe Sif its rank at Pisn. (This presupposes that n < m.)If S c S’and 
the canonic inclusion i, that sends each point of S to itself regarded as a point of 
S’, is everywhere regular, S is said to be a submanifold of S’. (Open sub- 
manifolds, as defined on page 258, are certainly submanifolds in the present 
sense.) It is not difficult to prove that each point PeS lies in the domain of 
some chart of S’ which maps a neighbourhood of P in S into the flat 
{(r1,5. Pd eee =Tfy42 = +--+ =l_ =O} CR”. The collection of all such 
charts (for every P € S)is readily made into an atlas for S by restricting them to 
S and ignoring the last (m—n) coordinate functions of each, which by 
definition are identically zero. With this “induced” structure, S is an n- 
manifold. Let P be a point of S. A vector veS> is said to be tangent to the 
submanifold S at P if v lies in the range of i,p». An (n—1)-dimensional 
submanifold of an n-manifold is called a hypersurface. Hypersurfaces are 
usually called surfaces if n = 3, paths if n = 2. 

As a useful application of the foregoing notions, let us now define the 
concept of the Lie derivative of a vector field along another. We consider a 
vector field V on an n-manifold S. Let y? denote the maximal integral curve of 
V through a point PeS. The reader should satisfy himself that if V is a 
complete vector field, the mapping F, that sends each Pe S to y?(t) (that is, to 
the point corresponding to the value ¢ of the parameter of the maximal integral 
curve of V through P) is defined for every te R and is in effect a bijective 
mapping of S onto itself. If V is not complete there is at any rate for every Pe S 
an open interval Ip < R, containing 0, such that, for each te Ip, the mapping 
F,:Q1++y2(t) is defined on a neighbourhood N , of P, which it maps bijectively 
onto a neighbourhood of y?(t). Let W be another vector field on S. If PES, Wp 
is an element of the tangent space Sp. Let us write yt for y?(t). Obviously, if | t| 
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is sufficiently small, yt exists and F_, is defined on a neighbourhood of it and 
sends yt to yO = y?(0) = P. Hence, the differential F_,,,, of F_,at yt maps the 
tangent space S,, onto Sp. Thus F_,,,,(W,,) is an element of Sp which can be 
added to W, or to its inverse — Wp. The Lie derivative L ,,W of the vector field 
W along the vector field V at a point P of S is defined by: 


ae | 
Ly Wp = lim © (F -sgye( Wy.) ~Wo) (A8) 


The concept of a Lie derivative along a given vector field can be extended ina 
natural way to scalar fields and to tensor fields of all orders (on tensor fields, 
see below and page 99). It turns out that the Lie derivative L , fof a scalar field 
falong the vector field V is none other than Vf, the “directional derivative” of f 
along V. It is clear from equation (A.8) that Ly W isa vector field on S. It can be 
shown that, if fis a scalar field on S, Ly, W(f) = V(WS) — W(V/f). If Ly W = 0 
we say that W is Lie dragged by V. (This terminology is also used if W isa scalar 
or tensor field). Clearly, the vector fields 0/@x'(1 < i < n) determined by any 
chart x of S are Lie dragged by one another (for 67/0x!'dx/ = 67 /Ax/dx'). 
Suppose now that (V',..., V") are n vector fields on an open submanifold 
U cS, such that at each PeU, (V}, .. ., Vb) is a basis of Up. (V',.. ., V")is 
an n-ad field on U. It is not hard to see that if the V' are mutually Lie dragged 
(ie. if L,:V/ vanishes everywhere for all values of the indices i and j) their 
integral curves through any given Pe U are the parametric lines of a chart of S, 
defined on U and mapping P on 0. An n-ad field consisting of n mutually Lie 
dragged vector fields is said to be holonomic. 

The vector spaces Spand S}, attached to any point P of ann-manifold S, give 
rise to an infinite array of vector spaces, namely, the spaces of multilinear 
functions on suitable Cartesian products constructed from Spand S}. (E.g., of 
bilinear functions on Sp x Sp, on Sp x Sp, on Sp x Spand on Sp x Sf, etc.) The 
elements of such spaces are called tensors at P. I deal with them in Section 4.3, 
where I define the concept of a tensor field on S. The crucial and indeed the 
hardest step on our way to this concept is the definition of the tangent space Sp. 
This can be given in several different ways, besides the one proposed above.” 
The intuitive prototype of Sp is of course the tangent plane at a point of an 
ordinary smooth surface—which, if the latter is flat, may be said to coincide 
with it, but otherwise extends on its own into the surrounding space. However, 
neither our definition nor its surrogates rely on the existence of a manifold of 
more dimensions than S, in which S is embedded and Sp may spread. This is of 
little consequence in pure geometry, for every n-manifold is diffeomorphic 
with a submanifold ofa 2n-dimensional Euclidean space (Whitney, 1944a). But 
it is of the utmost importance in natural philosophy, for it enables one to make 
sense of the idea of a cosmic manifold, that is not coincident and possibly not 
even diffeomorphic with its tangent spaces, and yet is not embedded in any 
physical manifold of higher dimension. 


APPENDIX 263 
B_ Fibre Bundles 


A bundle E over B consists of two topological spaces E and B, and a 
continuous mapping z of E onto B. B is the base of the bundle. z is called the 
projection. A cross-section of E is any continuous mapping f of B into E, such 
that x-fis the identity on B. (For each xe B, f picks out an element f(x) of 
the fibre of x over x.) 

In this book we are interested only in differentiable bundles. We assume that 
Eand Bare, respectively, an n-manifold and an m-manifold (n = m), and that x 
is smooth. We also expect cross-sections to be differentiable to a sufficiently 
high order, which, however, may be less than the order of differentiability of z. 

Let E bea differentiable bundle over B, with projection z, and let G be a Lie 
group acting on E on the right.' E is a principal fibre bundle over B with 
structure group G if the following conditions are fulfilled: 


(i) The action of G on E is free, i.e., for any aé E and any g€G, ag = aif 
and only if g is the neutral element of G. 

(ii) The action of G is transitive on each fibre of 2 and the partition of 
E into fibres by 2 agrees with its partition into orbits by the action 
of G.? 

(iii) Each ue B has a neighbourhood U such that 2~'(U) is mapped 
diffeomorphically onto UxG by ar+(x(a), f(a)), where f is a 
mapping such that, for every ae U, geG, f(ag) =f(a)g. 


Condition (iii) implies that it is always possible to find an open covering 
{U,},e1 of B, such that, for each ae 1, z~'(U,) is mapped diffeomorphically 
onto U,x G by a + (x(a), f,(a)), where f, satisfies the relation f, (ag) = f, (a)g 
for every aeU, and every geG. If aen'(U,AU,), flag (lag)? 
= fy (a)(,(a)) 1 so that the latter product depends only on x(a), not on a. We 
can therefore define a mapping f,,: U, © Ug > G, by fa(™(a)) =fy (a) f,(a))*. 
The mappings f,,, for all a, Be J, are the transition functions of the principal 
fibre bundle (E, B, x, G) corresponding to the open covering { U,},<, of B. It is 
clear that if a lies in the intersection of U,, U, and U,, 


fya(@) = Sop (@) fea (a) (B.1) 


On the other hand, given a Lie group G, an n-manifold B with an open covering 
{U,},¢, and a collection of mappings f,,: U, U, > G (for each non-empty 
intersection U, > Ug) that satisfy (B.1), there exists a principal fibre bundle E 
over B with structure group G and transition functions f, ,.° 

Let E be a principal fibre bundle over B with projection xz and structure 
group G. Let G act effectively on the left on a differentiable manifold F. The 
mapping of G x (E x F) into (E x F) by (g, (a, s)) + (ag,g~'s) is plainly an 
action of Gon E x F, on the right.* Two pairs (a, s), (a’,s’)€ E x F belong to the 
same orbit of this action if there is a géG such that (a’,s’) = (ag,g~‘s). Such 
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orbits are obviously disjoint by pairs. Let E’ be the set of all orbits of the action 
of Gon E x F. We denote by [a, s] the element of E’ that contains the pair (a, s). 
By condition (ii) on p. 263, (a) = x(ag) for every ae E and geéG. We can 
therefore define a mapping 7’ of E’ onto B by setting x’ ([a,s]) = x(a). By 
condition (iii), for each ue B we can find a neighbourhood U of u and a 
diffeomorphism of 2~'(U) onto U x G. We shall now define a bijection f of 
n'~1(U) onto U x F such that the fibre of 2’ over each ve U is mapped onto 
{v} x F. In the fibre of z over each ve U there is a unique element a which the 
chosen diffeomorphism of 1~1(U) onto U x G maps on (x(a),e), where e 
denotes the neutral element of G. Let us set f([a,s]) = (2(a),s). claim that we 
have thereby defined the sought-for ee To prove it, 1 observe that for 
each [ a,s]ez’~' (U) there isa pair (a’, s’)€ [a,s], such that a’ is mapped by the 
chosen diffeomorphism on (z(a’), a Our stipulation therefore defines a 
mapping of n’~1(U) into U x F. The mapping is plainly surjective and sends 
n'~*(v) to {vl x F (for each véU ). To show that it is also injective assume 
that [a’,s’] # [as]. If x(a’) 4 x(a), obviously f([a’, s']) # f({a,s]). On 
the other hand, if z(a’) = x(a) there is a geG such that a’ = ag. Hence 
[a’,s’] =[a,gs']. But gs’ is surely different from s, or else s’=g™'s 
and [a’,s’] = ee ~!s] = [a,s], contrary to our assumption. Consequently, 
f(a’, s’]) = (x(a), gs’) # (n(a),s) =f([a,s]). Since U x F is an open sub- 
manifold of i product Caan Bx F, the charts of the latter can be 
combined with the bijections we have just defined (for each u € B) to bestow a 
differentiable structure on E’, whereby 7’ is a smooth mapping. E’ is then a 
differentiable bundle over B with projection 1’. We call it the fibre bundle with 
typical fibre F associated with the principal fibre bundle (E, B, x, G). 

Let M bean n-manifold. We shall define the principal bundle of n-ads over M. 
An n-ad at Pe M is an ordered basis of the tangent space M,. Let FM denote 
the aggregate of all n-ads at every point of M. The mapping x: FM > M 
assigns to each element of FM the point of M at which it is an n-ad. We shail 
make FM into an (n + n)-manifold. For each chart x in the maximal atlas of M 
we define a chart X on x~'(dom x) by the following stipulation: If F 


={F,,...,F,} is an n-ad at Pedom x, the coordinate functions 
x!,..., %” +" of X satisfy at F the equations: 
x\(F)=xi(P) x +(F) = F,(x!) (B.2) 


(l<ij<n) 


FM is a manifold with the maximal atlas determined by all such charts x. On 
this manifold, x is certainly smooth. We shall now define an action of the 
general linear group GL (n, R), on the right, on FM. Let A = (Aj) be a non- 
singular n x n real-valued matrix and let F = (F,,..., F,) beann-adat PEM. 
We set FA = (AjF,, .. .. AJ F; where summation is understood, as usual, 
over repeated indices. F A is obviously an n-ad at P. The action thus defined 
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evidently meets conditions (i) and (ii) on p. 263. The diffeomorphisms required 
by condition (iil) can be readily found. For each P € M there isa chart x defined 
on a neighbourhood U of P. For each QeU, (€/@x'|g) is an n-ad at Q. We 
define a mapping f of 1~'(U) onto GL(n,R) by stipulating (i) that f sends 
(G/éx'|g) to the identity matrix, and that (ii) for any Fex~'(U) and any 
AéeGL(n,R), f(FA)=f(F)A. The mapping Fro (x(F),f(F)) is then a 
diffeomorphism of x~'(U) onto U x G that meets the prescribed require- 
ments. (FM, M,2,G L(n, R)) is the principal bundle of n-ads over M. We 
denote it hereafter by F M. A cross-section of FM is called an n-ad field on M. 
If such an n-ad field exists, it defines canonic tsomorphisms between every pair 
of tangent spaces of M. Two vectors at different points of M are the canonic 
images of each other if they are the same linear combination of the local value 
of the n-ad field at their respective points. Two vectors are said to be parallel if 
they are thus the canonic image of each other. Parallelism is obviously an 
equivalence. M is said to be parallelizable if FM admits a cross-section. 
Parallelizable n-manifolds are the exception, not the rule. However, the 
domain of any chart of an n-manifold M is always itself a parallelizable 
submanifold (p. 93). 

As is well-known, GL(n, R) acts transitively on the left on R" as a group of 
linear transformations. We can therefore define, according to the procedure 
sketched above, the fibre bundle with typical fibre R" associated with the 
principal bundle F M. We denote it by TM. Since usually there is little danger 
of confusion, we shall denote the projection on 7M onto M by z. A typical 
element of TM isan equivalence class. [F,r], where F = (F,,..., F,)isan-ad 
at x(F)eM and r=(r',...,r") € R". Since for each Ac GL(n,R),[F,r] 
= [FA,A™~'r],[F,r] is uniquely associated and may indeed be identified with 
the vector r'F, at x(F). Through this identification the fibre of 2 over each 
PeéM turns out to be the tangent space Mp. The underlying set of TM is the 
union of all such tangent spaces. 7M is called the tangent bundle of M. A cross- 
section of 7M is a vector field on M (cf. p. 260). 

Let t4 be the linear space of (q, r)-tensors on R". GL(n, R) acts transitively on 
the left on t¢in a natural way.° We may therefore define the fibre bundle 77M 
with typical fibre t? associated with the principal bundle F M. T?M can be 
identified with the union of the spaces of (q, r)-tensors at all pointsof M (p.99) 
and is called the (qg,r)-tensor bundle of M. A cross-section of this bundle is a 
(q,r)-tensor field on M. 


C_ Linear Connections 


(Unless otherwise noted, summation is implied over repeated indices.) 

We have seen above that vectors at different points of a parallelizable 
manifold can be compared with one another and partitioned in a natural way 
into equivalence classes of parallel vectors. In a non-parallelizable manifold— 
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e.g. the surface of a sphere---parallelism between distant vectors cannot be 
defined thus absolutely, but one can usually introduce a concept of relative, 
path-dependent parallelism which enables one to compare vectors at different 
points with respect to any particular path joining those points. As we shall see 
below (p. 273), this notion of relative parallelism can be defined in an 
n-manifold M if and only if the principal bundle of n-ads over M is 
parallelizable, i.e., if and only if a concept of absolute parallelism is applicable 
to F M. Such is indeed the case if M is topologically paracompact, i.e. if every 
open covering of M has a locally finite open refinement. 

Let S be a smooth surface in Euclidean 3-space. Consider two vectors v and 
w at different points P and Q in S. If S is a plane we naturally say that w is 
parallel to v if w = t, p(v), where t is the translation that sends P to Q. If Sisa 
developable surface, such as a cylinder or a cone, we say that v and w are 
parallel if f, p(v) is parallel to f, 9(w), where fis an isometry of a suitable region 
of S into a plane. If S is not developable we can still make sense of the idea of 
parallelism between v and w, relatively to a path joining P and Q, as the 
following example will show. Let S be a sphere and y a curve in S from P to Q. 

Let a plane touch S at P and allow S to roll, without slipping, in such a way 
that the plane always touches a point on the range of y until it reaches Q. The 
vectors v and wcan be naturally identified with tangent arrows emerging from 
the sphere at P and Q. At the beginning of the motion, v coincides with some 
directed segment v’ on the plane; at the end, w coincides with another such 
segment w’. We say that v and ware parallel along the curve y if and only if v’ is 
parallel to w’. Note that this concept of parallelism depends essentially on the 
range of y, not ona particular parametrization. Note further that, if P = Q and 
y 1s a closed curve, v need not be parallel to itself along it, indeed v is parallel, 
along diverse loops, to different vectors at P. In particular, if v is tangent to y, v 
is self-parallel along y if and only if y is a great circle. Indeed, if y isa great circle, 
parametrized by arc length, all its tangent vectors are parallel to one another 
along it. In this sense, great circles have constant directions and are the 
straightest lines on a circle. 

The notion of path-dependent parallelism is extended to a vast class of 
manifolds by means of the concept of a linear connection. This concept has 
been defined in several equivalent ways. The definition we shall propose below 
was given by Ch. Ehresmann (1957), but is ultimately based on the method and 
ideas of Elie Cartan. Other definitions are easier to spell out, but this one is, in 
my opinion, the most intuitive and illuminating.' We develop first some 
notions that will be required for it. 


1. Vector-valued Differential Forms 


Throughout this paragraph, V denotes an m-dimensional real vector space, 
endowed with the natural differentiable structure defined by its isomorphisms 
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with R™. We agree to call a list “repetitious” if it contains the same object more 
than once. Let W be an n-dimensional real vector space. A p-linear mapping 
f:W°— Vis said to be alternating if every repetitious list in W” lies in the 
kernel of f.? (Note that according to this definition all 1-linear mappings of W 
into V are alternating, for there are no repetitious lists in their domain.) Let S 
be an n-manifold. We agree to call any smooth mapping of S into Va V-valued 
0-form on S. A V-valued p-form w on S(p 2 1) is a correspondence that assigns 
to each PeS an alternating p-linear mapping wp of Sp into V, and varies 
smoothly with P. We observe at once that w determines a mapping—also 
called w—of the p-th Cartesian power of the module ¥ !(S) of vector fields on 
S into the set of V-valued 0-forms on S; namely, if X',..., X?e~ '(S), 
w(X',..., X”) is the 0-form P++w,(X},...,X). We also note that if 
p > n,wisidentically 0.° Lete,,;,___;,beequalto 1 orto —lif(i,,i,,.. ., i,)is, 
respectively, an even or an odd permutation of (1, 2, . . ., p),and let it otherwise 
be 0. If @, isa V-valued p-form and w, isa V-valued q-form on S, their exterior 
product w, A @, is the (p+ q)-form whose effect on a list of p + q vector fields 
X'ew'!(S) is given by: 


: é 
(p q)! iyi... dpe 
X W(X, 2. Xie) (C11) 


(@, A w,)(X',..., X?*) = w,(X4,..., X*) 


(where the indices i, range over {1,.. ., p+q}). The operation thus defined is 
evidently associative. It therefore makes sense to speak of the exterior product 
@,A@,A...A@,, of any number of V-valued forms on S. 

The p-forms on an n-manifold S defined on page 99 are of course R-valued 
p-forms. On the domain of a chart x any such R-valued p-form q@ is equal to a 
linear combination of all the different p-forms that can be constructed by 
taking the exterior product of p differentials of the coordinate functions. 
Symbolically: 


o= Yo faiz..i, dx Adx® Ar... Adx' (C.1.2) 


ip sips... <i, 


There is a unique linear operator d on the R-valued forms on S, which satisfies 
the following conditions: 


(i) If wis an R-valued p-form on S, d@ is an R-valued (p+ 1)-form on S$ 
(p 20); 
(ii) d(da@) = 0; 
(iii) if f is an R-valued 0-form (i.e. a scalar field) on S, df = dy, the 
ordinary differential of /; 
(iv) if w, is an R-valued p-form and w, an R-valued q-form on S, 
d(w, A @,) = dao, A @,+(-—1)?a, A day. 
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dw is called the exterior derivative of w. On the domain of a chart x, where w is 
given by (C.1.2) 


dw = y dfiis...i NAx "A dxX? A... A dx’ (C.1.3) 
i <i<... <i, 
If X,,..., X,4, are any vector fields on S, 
r+l 


(r+ 1)do(Xy,..4 Xa.) = ¥ (“Dit X,0(X,,.. Xi. s Xa) 


i=1 


Boe a Wad Ste, Cae, Cremer. Coeemres 


i<j 


(C.1.4) 
> X41) 


where Ly X; stands for the Lie derivative of X; along X;,* and the symbol X; 
indicates that X; is omitted.° In particular, if w is a 1-form and X and Y are two 
vector fields on S, 


2da@(X, Y) = Xw(Y) — Yw(X) — (Lx Y) (C.1.5) 


Let w be a V-valued p-form on S and let (e,,. . ., e,,) be a basis of V. There 
are then m R-valued p-forms on S, w',...,@", such that w = Z,e,w'. (In 
other words, if X',...,X? are vector fields on S, wp(Xp,..., X%) 
= L,e,0'p(Xp, . . ., X%), at each Pe S.) We define the exterior derivative dw of 
the V-valued p-form w as 


dw = £,e,;da! (C.1.6) 


2. The Lie Algebra of a Lie Group 


Let V bea vector space. A Lie bracket on Visa bilinear mapping V x V > V, 
by (u,v) [u,v], such that [u,v] = —[v,u], and [[u,v],w]+[[w,u],v] 
+[[v,w], u] = 0. The vector space V, endowed with a Lie bracket, is said to be 
a Lie algebra. Thus, for example, the Lie derivative of a vector field along 
another (p. 262) is a Lie bracket on the real vector space of vector fields over an 
n-manifold, and one often writes [X, Y] for LxY. 

Let G be an n-parameter Lie group. G acts on itself on the left transitively 
and effectively by (h,g)+ hg. Thus, each he G defines a diffeomorphism of G 
onto itself, the left translation by h, L,, which sends each géG to hg. A vector 
field X on Gis said to be left invariant if L,»X = X for every he G. The set of left 
invariant vector fields on G is a real vector space which we shall denote by gq. 
(The linearity of the differential entails that, if X and Y are left invariant fields 
on Gand geR,qX and X+ Y are left invariant fields as well.) g is in fact a Lie 
subalgebra of ¥ ' (G). (For the Lie derivative [X, Y] is left invariant if X and Y 
are left invariant.) g is called the Lie algebra of the Lie group G. 

We shall now show that g is linearly isomorphic to the tangent space G, (at 
the neutral element e€G), and is therefore n-dimensional. The mapping of 9 
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into G, by X +> X, is a linear injection. (If X, —Y, = 0 for X, Yeg, then, for 
any heG, X,—Y, = Laye(X.) — Lagel Ye.) = Liye(X. — Y,) =O—since L,,, is 
linear; consequently, X = Y.) If vis any vector at e, the vector field V, defined 
on all G by V, = L,,.(v)(hEG) is evidently left invariant and such that V, = v. 
The mapping X + X, is therefore surjective, and hence a linear isomorphism. 
It is indeed an isomorphism of Lie algebras, if the Lie bracket on G, is defined 
by [X,, Y.] = [X, Y],. It is often convenient to identify g and G, through this 
canonic isomorphism. 

Let (X',..., X") be a basis of g. The Lie bracket [X/, X*] of any two 
elements of this basis is equal to a linear combination 2,Ci, X‘. The numbers 

% are known as the structure constants of the Lie group G (relative to the 
basis (X‘) of g). 

If fis an automorphism of G (ie. a diffeomorphism of G onto itself that is at 
the same time a group homomorphism), f (e) = e, and f,, is an automorphism 
of G, (i.e. a linear isomorphism that preserves the Lie bracket). Via the canonic 
isomorphism described above, f determines a matching automorphism of the 
Lie algebra g. In particular, if fis the inner automorphism g ++ hgh™', induced 
by heG, the matching automorphism of g is denoted by ad(h). The mapping 
h + ad(h)isa representation of the Lie group G in the linear space g, known as 
the adjoint representation. If R, denotes the “right translation of G by h”—i.e. 
the permutation of G by g-+gh—we have that h~' gh = R,L,-1g. Hence, for 
every XEQg, ad(h"')X = R,,X. 


3. Connections in a Principal Fibre Bundle 


Let E be a differentiable principal fibre bundle over an n-manifold B, with 
projection x and structure group G. We shall denote the points of E by small 
italics and write mq for the value of zat qé E. As we know, G acts on E freely on 
the right (p. 263). Let R denote this action. For each geG there is a 
diffeomorphism R, of E onto itself, by q +> qg. We often write R,q for qg. If G 
is m-dimensional, E must be (n + m)-dimensional. The fibre of over any point 
of Bis diffeomorphic to G and is therefore an m-dimensional submanifold of E. 

Consider an arbitrary point qe E. The vectors at q tangent to the fibre of x 
over 7zq constitute an m-dimensional subspace V, of the (n + m)-dimensional 
tangent space E,. We call V, the space of vertical vectors, or the vertical 
subspace at q. Note that V, is the kernel of the differential x,, of zat q.° There is 
a natural isomorphism of V, onto the Lie algebra g of G. As G acts on E by R, 
each left invariant vector field X € g defines a vector field X on E. If y denotes 
the maximal integral curve of X through the neutral element e €G, the 
maximal integral curve of X through geéE is given by t++qy(t). Since the 
orbits of G’s right action are the fibres of z, the value X, of X at q is tangent to 
the fibre over mq and therefore lies in V,, The mapping ") + V, by XX, isa 
linear injection of an m-dimensional vector space into another and hence an 
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isomorphism. The inverse of this mapping is the natural isomorphism of V, 
into g. 

There are many n-dimensional subspaces of E, which, together with V,, span 
E,. Any such subspace of E,—i.e., any complement in E, of the vertical 
subspace at g— is called a space of horizontal vectors or a horizontal subspace at 
q. A connection in the principal fibre bundle E is a rule that picks out a 
horizontal subspace H, at each qeé E, subject to the following two conditions: 


(a) smoothness—H, depends differentiably on q; 


(b) consistency with the right action of G on E—if geG, H,, = H,). 


Ro xgl q 
It is not hard to see that if U is an open submanifold of B, and if we are givena 
connection in E, the restriction of this connection to 27 1(U )is a connection in 
the principal fibre bundle x~'(U) over U with structure group G, induced—as 
we say---by the connection in E. 

Let q+ H, bea connection in E. Relative to this connection, a vector at q is 
said to be horizontal if and only if it lies in H,. Any vector u € E, can be analyzed 
into a vertical component *ue V, and a horizontal component “ue H,, such 
that u = °u+"u. This terminology and notation naturally extends to vector 
fields on E. If X is such a vector field, we denote by ’X the vector field 
qt> °(X,). "X is defined analogously. X is said to be vertical if X = °X, 
horizontal if X = "X. Likewise a curve in E is vertical (horizontal) if all its 
tangent vectors are vertical (horizontal). Given a connection in the principal 
fibre bundle E we can define the exterior covariant derivative Da of a vector- 
valued r-form « on E. For any vector fields X,,..., X,,, on E 


Da(X,,..., Xp41) = da("X,. 2, X44) (C.3.1) 


If X is a vector field on the base manifold B there is a unique horizontal 
vector field X * on E, such that 2, X* = X. (The differential of x sends the value 
of X* at each qe E to the value of X at gq.) We call X* the lift of X.’ If y isa 
curve in B defined on an interval J, such that P = y(t) for some t € I, there exists 
for each pen‘ (P)a unique horizontal curve 9:1 > E, such that 9(t) = p.2 We 
call # the lift of y through p. Let y join P and Q in B. The lift of » through 
pen '(P) meets x~'(Q) at a definite point g. We say that q is parallel to p 
along y. By parallel transport along y from P to Q we mean the mapping that 
assigns to each element of x~!(P)its parallel along yin x7! (Q). We denote this 
mapping by T pg. It is not hard to see that, for any geéG, TPQ" R, a =R, ‘TQ. 
Hence, t} pg 1s a diffeomorphism of the fibre of x over P onto its fibre over Q. 

The notions of lift and parallel transport are readily extended to any fibre 
bundle E’ with typical fibre F, associated with the principal bundle (E, B, 2, C), 
if the latter is endowed with a connection. Since there is little danger of 
confusion I denote the projections of both E and E’ by a. We recall that each 
point of E’ can be equated with an equivalence class [q,5], with ge EandseF 
(p. 264), We denote by f, the mapping of E onto E’ by q:>[gq, s]. Let [p, t]e E’. 
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The vertical subspace V,,,] < Ef,,.] is simply the space of vectors at [p,t] 
tangent to the fibre of 2 over x[p,t]. We define the horizontal subspace 
H{p1] © Efp.1] as the image of H, < E, by the differential of f;: 


Hip} = Six (Ap) (C.3.2) 


It can be easily proved that H,,,,] is well defined—z.e. that it is independent of 
the particular choice of (p, t)¢[p, t]— and that H,,,,, and V_, ,] together span 
Ef») and have only the zero vector in common. A curve 7 in E’ is the lift of a 
curve yin Bify = x-. If y goes through Pe B there is a unique lift of y through 
each point in the fibre of x over P. Parallelism and parallel transport along y 
can therefore be defined in E’ just as in E. A cross-section o of E’ is said to be 
parallel (relative to the given connection) if the differential o,, at each Pe B, 
maps the tangent space Bp onto the horizontal subspace at o(P)EE' Ifoisa 
parallel cross-section of E’ and y is any curve in B through, say, Qe B, o-y is 
clearly the lift of y through o (Q ). Hence, all points of o-y are parallel along y by 
pairs. 

Let @, denote the natural isomorphism of the vertical subspace V, c E, 
onto the Lie algebra g of G (p. 269). @, is extended to a definite linear mapping 
w, of E, into g by stipulating that ker w, = H,. Due to the smoothness of the 
connection q+ H,, the correspondence w:q ++ w, depends smoothly on q 
and is therefore a g-valued 1-form on E. @ is called the connection form. 
Because the connection is consistent with R, we have that, for each géG, qe E 
and ve E,, 


Dag (Roxa 


v) = ad (g~*) a, (v) (C.3.3) 


On the other hand, if @ is any g-valued 1-form on E that satisfies (C.3.3) and is 
such that, for each qe E, w, agrees on V, with the natural isomorphism @,, the 
correspondence q++ker , is a connection in E. 

The curvature form of a given connection in E with connection form @ is the 
g-valued 2-form Q on E that is the exterior covariant derivative of w: 


Q= Dw (C.3.4) 


The connection and curvature forms satisfy the following equation of structure: 
for any vector fields X, Y on E 


dao(X, Y) = —4[@(X), w(¥)] + Q(X, Y) (C.3.5) 


Recalling the formula for the exterior derivative of a 1-form given on p. 268, we 
see that if X and Y are horizontal and L,Y stands for the Lie derivative of Y 
along X,@(L,Y) = —2Q(X, Y). The curvature form Q thus measures how the 
Lie derivative of an horizontal vector field along another departs from 
horizontality. If the curvature form is identically 0 the connection is said to be 


flat. 
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The curvature form Q satisfies the Bianchi identity: 
DQ=0 (C.3.6) 


In order to prove it, we choose a basis (e;) of the Lie algebra g and set w = wie; 
and Q = Nie, Let ci, be the structure constants of the structure group G 
relative to the basis (e,;) (p. 269). The equation of structure (C.3.5) is then 
expressed by the system 


dai = —5ci,@/ Aw +Qi (C.3.7) 

Consequently, by (1i) and (iv) on pp. 267. 
0 = dda! = —}c),(da/A aw —w/Ada*)+dQ' (C.3.8) 
Since all horizontal vector fields on Eare sent to zero by the connection form w 
and hence also by the forms w', dQ'(X, Y, Z) = 0 whenever X, Y and Z are 


horizontal. It follows at once from this and (C.3.1) that DQ vanishes 
identically. 


4. Linear Connections 


Let S be an n-manifold and let FS denote the principal bundle of n-ads over 
S(p. 264). As usual we denote the projection by z. A connection in FS is called a 
linear connection in S. 


Each qeFS is an ordered basis (q;,..., 4,) of the tangent space S,,. 
Consequently, if v is any vector at mq, v = v'q; for some list of real numbers 
(v', .. ., v"), called the components of v relative to g. The mapping of S,,into R" 


that assigns to each vector at mq the list of its components relative to q is a linear 
isomorphism, which we shall also designate by q. 
The canonical form 6 of FS is the R"-valued 1-form on FS defined by 


6,(b) = q' 2,6 (C.4.1) 


where b is any vector at ge FS. Note that @,(b) = 0 if and only if b is vertical. 
(Recall that, as x maps FS onto S, its differential z, maps the tangent bundle of 
FS onto that of S; all vectors tangent to the fibre of z over some PES are sent 
by z, to the zero vector in Sp.) 

Let us henceforth assume that °: q+ H,(qe FS) isa linear connection in S. 
We denote respectively by w and Q the connection form and the curvature 
form of T. wand Q take their values in the Lie algebra gl (n, R) of the structure 
group GL (n, R). Let E} denote the n x n matrix that has a 1 at the intersection 
of the i-th column and the j-th row, and a 0 everywhere else. The array of all 
such matrices is obviously a basis of gl (n, R). We may therefore set w = wiE', 
where the w/ are n? real-valued 1-forms on FS. Let x be a chart defined on 
U cS. There is an n-ad field on U which assigns to each PeU the n-ad 
(¢/€x'|p), <;<,. We denote this n-ad field by (@/éx). Set Gj = (2/Ax)* avi 
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(where (¢/0x)* stands for the pull-back of (@/0x), as defined on p. 316, n. 12). 
The jare real-valued 1-forms on U and can therefore be expressed as linear 
combinations of the 1-forms dx‘: 


i = Tijdx* (C.4.2) 


J 


The n? scalar fields Pi jare called, with some impropriety, the components of the 
linear connection T relative to the chart x. If y is another chart, defined on 
U' cS, and the components of I relative to y are denoted by I'’',, we have 
that,on UNU’, 
. Oyt Axi ax* = dy* d?x! 
12 = , C.4.3 
be = | ROI Dy dye T Ox! dyPaye (ca?) 


Ifa set of n° scalar fields (I ) is defined on the domain of each chart x of S, so 
that the diverse sets corresponding to the different charts are mutually related 
by (C.4.3), there is demonstrably a definite linear connection T in S whose 
components relative to any particular chart of S are the respective I"‘,.° 

The torsion form © of the linear connection is the R"-valued 2-form on FS 
which is the exterior covariant derivative of the canonical form 0: 


© = D@ (C.4.4) 


The torsion form @ is related to the canonical form @ and the connection form 
w by the equation of structure: 


dé (X, Y) = —4(@(X)0(Y) — w(Y)0(X)) + O(X, Y) (C.4.5) 
where X and Y are any two vector fields on FS. It can be shown that 
DO =QA0 (C.4.6) 


(Compare equations (C.4.4—6) with (C.3.4-6).) 

For each reR" there is a unique horizontal vector Bj at qe FS, such that 
6,(Bi) = r. The mapping q+ Bj is a horizontal vector field on FS, called the 
standard horizontal field corresponding to r, which we denote by B". It can be 
easily shown that if (e,,. . ., e,) is a basis of R", the standard horizontal fields 
B", .. ., B® are linearly independent. Through the natural isomorphism of the 
vertical subspaces of FS onto the n?-dimensional Lie algebra gl (n, R) (p. 269), 
we can define n? linearly independent vertical vector fields on FS. Together 
with the n linearly independent standard horizontal fields corresponding to 
any given basis of R" the said vertical fields form a basis of the module of vector 
fields on FS, which, if ordered, constitutes an (n?+n)-ad field on FS. 
Consequently, the existence of a linear connection in a manifold S entails that 
the principal bundle of n-ads FS is parallelizable. On the other hand, if FS is 
parallelizable we can always find an (n* +n)-ad field on FS with exactly n* 
vertical elements, and use the remaining n non-vertical ones for defining a 
connection. (Let the n-space spanned by the local value of these n non-vertical 
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vector fields at each ge FS be the horizontal space H, assigned to q by the 
connection.) 


5. Covariant Differentiation 


Let I’ bea linear connection in the n-manifold S. If V isa vector field and Wa 
(q,r)-tensor field defined on some neighbourhood of PeS, the covariant 
derivative VyW of W along V at P is given by: 


esi 
Vy Wp = lim — (tino) (W 2) — Wyo) (C.5.1) 
t0 


where y denotes the maximal integral curve of V through P (p. 000), yt 
abbreviates )(t), and t signifies parallel transport in the (q, r)-tensor bundle of S 
(pp. 265, 270). V,, W is plainly a (q, r)-tensor field defined on the intersection 
of the domains of V and W. It can be shown that, if W is also a vector field, 
like V, 

VvyW|p = 471 (V3 (0(W*))) (C.5.2) 


where V* and W* denote the lifts of V and W, respectively; g is any element of 
n~'(P)< FS; q™! is the inverse of the linear isomorphism that sends each 
vector at P = nq to its components relative to q, and @ is the canonical form of 
FS. We stipulate, moreover, that for any scalar field f, 


Wwf=Vf (C.5.3) 


It can be shown that, for any vector fields X and Y, any (q,r)-tensor fields W 
and W’, and any scalar field f, defined on S or on an open submanifold of S, the 
operator V defined by (C.5.1) satisfies the following four conditions: 


(i) VxzyWo = VxW+Vyw 

(ii) Vx(W+ W’) = VxyW4 Vx W' 
(iii) Vex W = fVxW em) 
(iv) Vx fW =fVxW+(Xf)W. 


On the other hand, the existence of an operator V that meets the above 
conditions determines a unique linear connection in S, relative to which V 
satisfies (C.5.1). It can be proved, moreover, that such an operator V is fully 
defined for all types of tensor fields on S, if it is defined for arbitrary vector 
fields on S, X, ¥, W and W’, and for arbitrary scalar fields in agreement with 
(C.5.3). One may thus endow a manifold S with a definite linear connection by 
introducing such an operator in ¥(S) and ¥'(S). The one-one correspon- 
dence between linear connections and covariant differential operators evid- 
ently authorizes us to call the former by the name of the latter, as it is often 
done. 
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The (absolute) covariant derivative of the (q,r)-tensor field W—often called 
the covariant differential of W—is the (g,r+ 1)-tensor field VW defined as 
follows: if X,X,,..., X, are any vector fields and w,,. . ., @, are any covector 
fields or 1-forms on the domain of W, 


VW (0045 6 «5 Og Xqy es XpeX) = (VW) (4, oy Oy Xy,-«  X,) 
(C.5.5) 


In view of (C.5S.1) and our remark about parallel cross-sections on p. 271, it is 
clear that the (q,r)-tensor field W is a parallel cross-section of the (q, r)-tensor 
bundle over its domain in S (relative to the linear connection given by V) only if 
Vx W = 0 for every vector field X on the domain of W. (C.5.5) entails then that 
W is parallel, relative to V, if and only if VW = 0. 

Let x be a chart defined on U c S. As we know, any vector field on U is equal 
toa linear combination of the vector fields 6/éx’. In particular, it can be shown 
that, if we allow V, to signify covariant differentiation along 0/0x*, 


V,0/éxi = Fi, a/ax! (C.5.6)!° 


where the ['j, are the components, relative to x, of the linear connection 
uniquely associated with V. From (C.5.4-6) one derives an expression for the 
components, relative to x, of the covariant derivative V W ofa (q,r)-tensor field 
W. For example, if W is a vector field equal (on U ) to W'G/0x', and we agree to 
write W!, for VW (dx', @/éx/), it will be readily seen that 


+Wwri, (C.5.7) 


(Note, by the way, that on U, V,W = W'.,é/0x'; so that the (i,j)-th 
component, relative to a chart x, of the absolute covariant derivative of a 
vector field is equal to the i-th component, relative to that chart, of its covariant 
derivative along ¢/¢x/.) The general expression, for a (q,r)-tensor field W, is 
given by equation (5.5.1) on p. 155. 


6. The Torsion and Curvature of a Linear Connection 


Let I be a linear connection in the n-manifold S. Its torsion form © and its 
curvature form Q determine respectively, a (1, 2)-tensor field on S, called the 
torsion and denoted by T, and a (1, 3)-tensor field on S, called the curvature and 
denoted by R. The definition of T and R can be made simpler and more 
perspicuous if we first observe that any (q, 1)-tensor field Q(g 21) on a 
manifold S can be regarded as a q-linear mapping of (¥'(S))* into ¥"'(S), 
insofar as Q assigns to each list of q vector fields on S, X,,. . ., X,, the linear 
mapping of ¥,(S) into R by w+> Q(a, X,..., X,). This linear mapping, 
which belongs to ¥'(S}the dual of ¥,(S}—is usually denoted by 
Q(X,,...,X,). By the same token, if g > 1, Q can also be regarded as a 
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(q —1)-linear mapping of (¥'(S))*~? into the linear space of linear mappings 
of ¥'(S) into itself, inasmuch as Q assigns to each list of g — 1 vector fields on 
S, X,,...,X,-1, the mapping X+> Q(X, X,,..., X,_1). This mapping is 
usually denoted by Q(X,,..., X,-1). We write Q(X,,...,X,-,)X for the 
value of this mapping at X e ¥''(S), ie. for the vector field Q(X, X;,.. ., Xq-1): 
Though the symbol Q is used hereby with deliberate and seemingly perverse 
ambiguity, no confusion will arise if due attention is paid to the nature and 
length of the list of objects to its right. 

The (1, 2)-tensor field T and the (1, 3)-tensor field R will be fully determined 
if we give the value at any Pe S of T(X, Y) and R(X, Y)Z, for arbitrary vector 
fields X, Y and Z, defined on a neighbourhood of P. We set 


T(X, Y)|p = q7'(20,(X*, Y*)) (C.6.1) 
R(X, Y)Z|p = q" 1 (2Q,(X*, Y*)(qZp)) (C.6.2) 


We use here the same nomenclature as in equation (C.5.2), with q an arbitrary 
n-ad at P; we write qZp for the value of gq at Zp—e. for the list of Zp’s 
components relative to q—; 20,(X*, Y*)(qZp) is then the image of qZp by the 
linear mapping 2Q,(X*, Y*) of R” into itself (recall that the curvature form Q 
is a gl(n, R)-valued 2-form). Our stipulations only make sense if the right-hand 
sides of (C.6.1) and (C.6.2) remain constant as qg ranges over 2 '(P). The proof 
is not difficult, but it would take up more space than we can afford here."! 
Due to the skew-symmetry of the curvature form Q we have that 


T(X, Y) = —T(Y, X) (C.6.3) 
and 
R(X, Y)Z = —R(Y, X)Z (C.6.4) 


It can also be proved that 
o(R(X, Y)Z) = o(T(T(X, Y), Z) + (VyT)(Y, Z)) (C.6.5)'? 


where o signifies the cyclic sum with respect to the three variables X, Y, Z. 
(Thus, o(R(X, Y)Z) stands for R(X, Y)Z+R(Y, Z)X + R(Z, X)Y.) 
From (C.6.1—2) and (C.5.4) one infers that 


T(X, Y) = VyY —VyX —[X, Y] (C.6.6) 
and 
R(X, Y)Z = Vy VyZ—VyVyZ— Vix. y]4 (C.6.7) 


where [X, Y] denotes the Lie derivative of Y along X. The components of T 
and R relative to a chart x can be obtained by straightforward computation 
from the above formulae and (C.5.6). If the components of the linear 
connection relative to x are denoted, as usual, by rin, the components of the 
torsion and the curvature are, respectively, given by: 


Ti, =Ti,-Ti, (C.6.8) 
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Ri, = OV ,/ox" — aVi ,/Ox' + Ps, Fi, PPh, (C.6.9) 


Direct calculation also yields the following two interesting results: if f is a 
scalar field and V is a vector field in the domain of x, 


Sng Sik = Tig hi (C.6.10) 
Vig — Via = Ri V+ TEV (C.6.11) 


where f,,; stands for (VVf)(0/dx*, 0/dx/), ete. 

If the torsion T = 0 everywhere, the linear connection is said to be torsion- 
free or symmetric. (The latter name stems from the fact that, by (C.6.8), T 
vanishes on the domain of a chart if and only if the respective connection 
components are symmetric in the two lower indices.) The linear connection is 
demonstrably flat, in the sense defined on p. 271, if and only if the curvature 
R is identically 0. 


7. Geodesics 


Throughout this section, S is an n-manifold with linear connection T. The 
concept of a geodesic extends to the vastly arbitrary structure (S, I) the 
intuitive idea of a straight or straightest line, a line that keeps its direction 
through its entire course. A curve y in S is a geodesic (of the linear connection 
I) if all vectors tangent to y are parallel to each other along y. More formally: 
Let y: 1 + Sbeacurve in S. Let X bea field of tangent vectors on y (i.e. let X bea 
vector field defined on a neighbourhood of each point in y’s range y(J), such 
that X,, = y,, foreach t€ I). y isa geodesic of the linear connection if and only if 
V,X vanishes everywhere on the range of y, that is, if and only if 


VxX = 0 (C.7.1) 


for each value of the parameter ¢. If z is a chart defined on a neighbourhood of 
y(J), it is clear that, on y(I), X = (dz'- y/dt)@/éz'. From (C.7.1) and (C.5.4) we 
infer then that y is a geodesic if and only if it satisfies the geodesic equations: 


= (C.7.2) 


where the I", are the components of the linear connection relative to z. 

Let y’ be a reparametrization of the geodesic y, ie., let y = y’-f, for some 
homeomorphism f of R onto itself. If Y is a field of tangent vectors on y’ and s 
denotes the new parameter, we generally have that 


VyY ly. = A(S)Y | (C.7.3) 


for some real-valued function 4. This implies that, if s, and s, are two values of 
the parameter s, y,,, 18 linearly dependent on the image of y,,, at y(s,) by 
paralle] transport along y, though it need not be strictly equal to it. Thus, every 
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reparametrization of the geodesic y may be said to “point constantly in the 
same direction”, even if it is not a geodesic according to our definition. Of 
course, if y’ is a geodesic, i.e. if the function 2 is identically 0, y,,, 1s the parallel 
along y to y,,, at y(s;)—in our sense of this expression. Such will be the case if 
and only if f is an affine transformation—at least on the domain of y—te. if 
and only if s = at + B(a, BER; « # 0). s is then said to be an affine parameter. 

Acurve yin S isa geodesic if and only if any lift of y is an integral curve of one 
of the standard horizontal fields of FS (p. 273). This fundamental theorem 
entails that for each point P in S and each vector v in Sp there is a unique 
geodesic y such that y(0) = P and y,9 = v. We call it the geodesic with initial 
condition (P, v). 

The linear connection in S is said to be complete if all its geodesics can be 
extended to geodesics defined on R. The fundamental theorem stated at the 
beginning of the foregoing paragraph entails that the linear connection is 
complete if and only if every standard horizontal field of FS is complete in the 
sense defined on p. 260. 

If the connection I is complete there is for each P € Sa mapping Expp of Sp 
into S, called the exponential map at P. This is defined as follows: for each 
ve Sp, let denote the geodesic with initial condition (P, v); the 


Expp(v) = 6(1) (C.7.4) 


If is not complete, the exponential map Expp is defined by (C.7.4) only on the 
set of vectors v at P, such that the geodesics # with initial condition (P, v) are 
defined at least on [0,1]. It is not hard to see that this set constitutes a 
neighbourhood of 0 in Sp. In any case, there is a neighbourhood of 0 in Sp, 
included in the domain of Expp, which Expp maps diffeomorphically onto a 
neighbourhood of P in S. On the latter neighbourhood, Expp has an inverse, 
which we denote by Expp’. If q denotes an n-ad at P, and hence also the 
respective isomorphism of S, into R" (p. 272), g: Expp* is a chart defined on 
the said neighbourhood of P. Such a chart is called a normal chart or a system 
of normal coordinates at P. Let the components of the linear connection 
relative to a normal chart at P be denoted by I’j,. It can then be shown that 
i,(P) = —Tj,(P). Hence, if T is symmetric, the Ii, vanish at P.’° 

Though each linear connection in S determines its set of geodesics, the same 
set can pertain to diverse connections. Indeed, we have the following 
important result. Let I be as hitherto an arbitrary linear connection in S. 
Let t be a fixed real number such that 0< +f <1. For every component 
Ti, of T relative to each chart x of S define, on dom x, a scalar field 
M,= 0 +U-or,; The arrays (Ij) corresponding to each chart 
x determine a linear connection I’ on S (p. 273). The geodesic equations 
(C.7.2) imply that [ and I’ share the same set of geodesics. Observe, in 
particular, that by choosing t = 1/2 we can obtain a symmetric connection 
that has the same geodescics as a given linear connection. 
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8. Metric Connections in Riemannian Manifolds 


Let S be an n-manifold with a proper (i.e. positive definite) Riemannian 
metric g (pp. 93,100). An orthonormal n-ad at PeS is an ordered basis 
(v,,...,0,) of Sp such that 


gplv;, 0)) = Oi (C.8.1) 


(that is, 1 ifi = jand Oifi # j). The set ofall orthonormal n-ads at each Pe S is 
made in an obvious way into a principal fibre bundle OS over S with structure 
group O(n)—the orthogonal group. (Cf. the construction of FS on p. 264.) For 
each connection I’ in (OS, S, O(n)) there is one and only one connection [in 
(FS, S,GL(n, R)) such that the canonical injection of OS into FS sends the 
horizontal subspaces of I’ into the horizontal subspaces of I.'* A linear 
connection in the Riemannian manifold S is said to be metric if it is the unique 
connection in FS which is thus determined by some connection in OS. It can be 
shown that a linear connection I in S is metric if and only if the Riemannian 
metric g is parallel relative to I’, in the sense defined on p. 271 (Recall that gisa 
cross-section of the (0, 2)-tensor bundle over S.) Hence, by a remark on p. 275, 
I is metric if and only if 


Vg =0 (C.8.2) 


(where V signifies covariant differentiation in (S, I°)). 

Among the metric connections that can be defined in an n-manifold S with 
proper Riemannian metric g there is one and only one that is symmetric or 
torsion-free. It is called the Levi-Civita connection of (S, g). The geodesics of 
this connection, in the sense defined on p. 277, are precisely the 
Riemannian geodesics of (S, g), in the sense defined on p. 94. Let x be a chart of 
S. Let F denote the Levi-Civita connection of (S, g). Its components relative to 
x are given by: 

Vig = 39" (0Gnj/Ox* + 8Gun/OX! — 09 jx/O") (C.8.3) 


Here the g,, = g(@/éx', d/0x/) are the components of g relative to x, while the 
g‘i are defined by the relations 


9D jx = 0x (C.8.4) 


By definition, the torsion of I is identically 0. Its curvature R is demonstrably 
equal to the (1, 3)-Riemann tensor of (S, g). More precisely, if R denotes the 
covariant tensor field on S constructed from the metric g in the manner 
proposed by Riemann in 1861 (p. 145), and X, Y, Z and W are any four vector 
fields on S, 


R(W, Z, X, Y) = g(W, R(X, Y)Z) (C.8.5) 


Thus, the (0,4) Riemann tensor R is none other than the covariant form of the 
curvature R (p. 100). If, as is customary, we look on two tensor fields related to 
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one another as R is to R by (C.8.5), as two manifestations of the same 
geometrical object, we must obviously identify the Riemann tensor of a 
Riemannian manifold with the curvature of its Levi-Civita connection. We 
therefore use the same letter R to denote them both. (C.6.4) implies that 


R(W, Z, X, Y) = —R(W, Z, Y, X) (C.8.6) 


Using (C.6.2) and (C.8.5) one proves that 
R(W, Z, X, Y) = —R(Z, W, X, Y) (C.8.7) 


As the Levi-Civita connection is torsion-free, (C.6.5) leads at once to the first 
Bianchi identity: 


R(W, X, Y, Z)+R(W, Y, Z, X)+R(W, Z, X, Y) =0 (C.8.8) 
Equations (C.8.6-8) yield, by straightforward calculation, that 
R(W, Z, X, Y) = R(X, Y, W, Z) (C.8.9) 
The second Bianchi identity, 
VxR(Y, Z)W + V\R(Z, X)W+VZR(X, Y)W = 0 (C.8.10) 


can be derived from the Bianchi identity (C.3.6), which is automatically 
satisfied by the curvature form of the Levi—Civita connection. 

The covariant Ricci tensor is the contraction of the curvature with respect to 
its only contravariant and its third covariant indices. The curvature scalar is the 
contraction of the mixed Ricci tensor.'* 

The above results can be extended to Riemannian manifolds with indefinite 
metrics. In particular, all statements in this Section 3.8 hold good for an 
n-manifold S with a Lorentz metric g (i.e. with a metric of signature 2 — n; see 
page 93), provided that we replace 6,; in equation (C.8.1) by 1 if i = j = 0, and 
by —6,,; otherwise, and that we substitute L(n) for O(n) and LS for OS. Here 
L(n) denotes the general homogeneous Lorentz group, which we can define as 
follows: Let H be the n x n matrix with its first diagonal element equal to 1, its 
other diagonal elements equal to —1, and all other elements equal to 0; L(n) is 
the subgroup of GL(n, R) represented by all non-singular matrices A which 
satisfy the equation 

ATHA=H (C.8.11) 


(where A! is the transpose of A). 

In the light of our remark at the end of Section C.6 (p. 277) it is clear that a 
Riemannian manifold (S, g) if flat in the sense of page 145 if and only if the 
Levi-Civita connection T of (S, g) is flat in the sense of page 271. In other 
words: all sectional curvatures vanish at each point of S if and only if the 
curvature form of I is identically zero. . 
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D_ Useful Formulae 


I collect here some formulae for easy reference. Definitions have been given 
elsewhere. Indices range from 0 to 3. Repeated indices imply summation over 
their range. 

Let g;; be the metric components relative to a particular chart x of a 
relativistic spacetime. Let g be the determinant of the matrix (g,;). If G,; is the 
cofactor of g;; in g, we write g’ for G;;/g. 

A comma followed by one or more indices will signify partial differentia- 
tion with respect to the corresponding coordinate functions. For example, 
Gij, kh = 6 g;j/Ox*0x". 

The components of the Levi-Civita connection relative to x are: 


Vig = 39" Griz + Gur, j — Ginn) (D.1) 
The components of the (1,3) Riemann tensor are: 
Rix = Tks. j a Vien + ris, 7 Vi Tks (D.2) 


The components of the covariant Riemann tensor are: 


Rai jx = Ghs Rijs 


a FGnx, ii + Gij. he — Ghj, ik —Gix,nj) + Isr(V is The _ a) (D.3) 
The components of the Ricci tensor are: 
Ri; = Rijs 
= 7(log 9), —Vijst Ms ry; —3T%; (log 9), (D.4) 


The curvature scalar is: 
Ri, = giR,, (D.5) 


The modified field equations of General Relativity, proposed by Einstein in 
1917, are: 
Ri; — 49, = —K(Ti; ~39:;Th) (D.6) 


Or, equivalently: 
R,,—39;;Ri+ 4g; = —KTi; (D.7) 


The original or unmodified field equations, proposed by Einstein in 1915, 
are obtained by setting 4 = 0 in (D.6). 

On the range of a geodesic y, within the domain of x, the coordinate 
functions x! satisfy the following differential equations, for each value of the 
affine parameter t: 


dx* 


Sar} + Vix (yt) 
ye yt 


dt 
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The symmetries of the Riemann tensor (C.8.6—9) entail that it can have no 
more than 20 independent components relative to any given chart. 10 of them 
are combined into the Ricci tensor and are algebraically tied by the field 
equations (D.6) to the energy tensor of matter. The remaining 10 components 
determine the 10 independent components of the Weyl tensor—introduced by 
Hermann Weyl (1918a, p. 404)}—which are given by: 


Cy jan = 2Rijnn + GR int GinRix — Gj Rin — Gin t SRG i Gin —GiG jn) 
(D9) 


It can be seen at a glance that the Weyl tensor has the same symmetries as the 
Riemann tensor. A quick calculation shows that 


Ce = 9" Ci inn = 0 (D.10) 


Notes 


Motto 


* The quotation from Riemann is taken from a posthumous philosophical fragment (Riemann 
(1902), p. 521). It may be translated as follows: “Physical science is an attempt to grasp Nature by 
means of exact concepts.” The Einstein quotation is part of the Herbert Spencer lecture, delivered 
at Oxford on June 10th, 1933 (Einstein (1934), p. 183). It can be approximately rendered thus: 
“Hitherto our experience entitles us to trust that Nature is the realization of the mathematically 
simplest. | am convinced that through a purely mathematical construction we can find the 
concepts and the nomic connections between them which furnish the key to the understanding of 
natural phenomena.” [All references to the literature are decoded on pp. 351 ff.] 


Introduction 


1. G.F. B. Riemann (1867); delivered as a public lecture in 1854. F. Klein (1893); first published 
in 1872. | examine the ideas of Riemann and Klein and their impact on late !9th-century thought in 
my book Philosophy of Geometry from Riemann to Poincaré, hereafter referred to as Torretti 
(1978a). See also below, pp. 26f. (on Klein), 93 ff, 143 ff. (on Riemann). It is worth observing that 
the Erlangen Programme does not encompass Riemann’s theory of metric manifolds, for the 
group of isometries of a Riemannian manifold generally consists of the identity alone, and is 
therefore powerless to characterize its structure. Nevertheless, the group-theoretical approach has 
been fruitfully applied to Riemannian geometry, at a higher level (so to speak), through the theory 
of fibre bundles. See for example, the remarks on metric connections on pp. 279f. 

2. The action of a group on a set is defined below, on p. 26. 

3. Compare Klein (1893), p. 67: “Es ist eine Mannigfaltigkeit und in derselben eine 
Transformationsgruppe gegeben. Man entwickele die auf die Gruppe bezigliche 
Invariantentheorie.” I should add perhaps that in Klein’s investigations the set underlying the 
group transformations is not quite abstract; it is a “number manifold” (Zahlenmannigfaltigkeit) 
with an inherent topology, inherited from the real line R or the projective line Ru {x }. 

4. Compare Poincare (1887), p. 216: “Geometry is nothing else but the study of a group, and in 
this sense one might say that the truth of Euclid’s geometry ts not incompatible with that of 
Lobachevsky’s geometry, for the existence of a group is not incompatible with the existence of 
another group. Among all possible groups we have chosen one, to which we refer physical 
phenomena, just as we choose three coordinate axes, to which we refer a geometrical figure.” 

5. Klein (1911), p. 21: “Was die modernen Physiker Relativitatstheorie nennen, ist die 
Invariantentheorie des vierdimensionalen Raum-Zeit-Gebietes x, y, z, ¢ (der Minkowskischen 
‘Welt’) gegentiber einer bestimmten Gruppe von Kollineationen, eben der ‘Lorentzgruppe’.” 
Cf. Minkowski (1915), delivered as a public lecture in 1907, and the Introduction to the second 
volume of Klein (1967). 

6. One such study is the learned book by Arthur I. Miller (1981) on the emergence and early 
interpretation of Special Relativity. Unfortunately it appeared too late to be of more than cosmetic 
use in the preparation of this one. However, it caused me to take a new look at my sources wherever 
I noted a factual discrepancy between Miller's story and mine. 

7. A more detailed yet sufficiently concise treatment of the main topics of Chapter 2, with 
schematic descriptions of the principal experiments, will now be found in A. I. Miller (1981), 
Chapter 1. 

8 Namely E. Guth (1970); M. A. Tonnelat (1971), Chapters IX and X: C. Lanczos (1972), 
J. Mehra (1974a,4), V. P. Vizgin and Ya. Smorodinskifi (1979). I have indeed learnt much from 
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them, but my exposition is definitely intended as an alternative to theirs. A more complete 
coverage of some special points will be found in the articles of J. Earman and C. Glymour 
(1978a, b; 1980a, b), J. Stachel (1980a, b), and E. Zahar (1980). 

9. B. F. Schutz explains most of these concepts at a more leisurely pace in his lucid and very 
readable book, Geometrical Methods of Mathematical Physics (1980). (Schutz collects, moreover, 
in Chapter | and in Section 2.22 of his book many of the definitions from analysis and linear 
algebra that I take for granted.) The reader who seriously wishes to study modern differential 
geometry would in my opinion do well to master the first two volumes of Spivak’s Comprehensive 
Introduction to Differential Geometry (hereafter referred to as CIDG), and then to turn directly to 
Kobayashi and Nomizu’s Foundations of Differential Geometry (FDG). Though undeniably 
difficult, Kobayashi and Nomizu’s magnificent treatise is much more perspicuous and thus more 
rewarding than other books, allegedly designed for student use. 

10. As presented, say, in Spivak’s Calculus on Manifolds. 

11. Chapters VI-IX, XI and XVI of MacLane and Birkhoff’s Algebra (1967) contain all that is 
needed—and much more. Nomizu’s Fundamentals of Linear Algebra (1979) is probably more 
suitable for self-study. 

12. As developed, say, in A. Lichnerowicz, Elements of Tensor Calculus (1962), or in J. L. Synge 
and A. Schild, Tensor Calculus (TC). 

13. A. Einstein (1954), H. Bondi (1964), E. F. Taylor and J. A. Wheeler (1963). J. G. Taylor’s 
Special Relativity (1975) is a short and very clear introduction to the theory. See also in this series, 
R. B. Angel, Relativity: The Theory and its Philosophy (1980). 

14. Two quite attractive recent introductory expositions of General Relativity are C. J. S. 
Clarke’s Elementary General Relativity (1979) and J. V. Narlikar’s General Relativity and 
Cosmology (1979). The serious student must eventually turn to Weinberg’s Gravitation and 
Cosmology (1972), Misner, Thorne and Wheeler’s Gravitation (1973) and Hawking and Ellis’s The 
Large Scale Structure of Space-Time (1973). These are as of now the great Relativity treatises of 
our generation, in the grand tradition of Weyl’s Raum-Zeit-Materie (1919a), Eddington’s The 
Mathematical Theory of Relativity (1923), Tolman’s Relativity, Thermodynamics and Cosmology 
(1934), Fock’s Theory of Space, Time and Gravitation (1959), and Synge’s Relativity: The General 
Theory (1960). 


Chapter 1 


Section 1.1 


1. Newton, PNPM, p. 16. 
2. Newton, “De Gravitatione et Aequipondio Fluidorum”, Def. 4, in Hall and Hall (1978), 
p. 91. 

3. Newton, “De Gravitatione .. .”, Def. 5, in Hall and Hall (1978), p. 114. 

4. Hall and Hall (1978), p. 114. 

5. Definition I11: “The innate force of matter is the power of resisting by which every body, as 
much as in it lies, perseveres in its state, either of rest, or of moving uniformly in a straight line.” 
Definition IV: “An impressed force is an action exerted upon a body, in order to change its state, 
either of rest, or of moving uniformly in a straight line” (Newton, PN PM, pp. 40, 41). Newton’s 
comments on Definition III raise some problems that we must ignore here; see McMullin (1978), 
pp. 33-43 and notes. 

6. Newton, PNPM, p. 41. 

7. Newton, PNPM, p. 54. As a matter of fact, Newton says that “the change of the motion 
(mutatio motus)”’—not its rate of change—is proportional to the motive force impressed. This 
means that Newton’s “impressed force” is what we now call “impulse” and is not synonymous with 
the current understanding of “Newtonian force”. Newton’s statement of the Second Law should 
therefore be transcribed, in modern notation, as FAt = A(mv). However, if one goes to the limit 
At + 0-—as Newton does, e.g., in his proof of Prop. I, Th. I (PNPM, pp. 88 ff.}—the foregoing 
equation entails that F = d(mv)/dt, which is the familiar version of the Second Law I gave above. 
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To avoid distracting the readers with niceties irrelevant to our present purpose, I shall hereafter 
understand Newton’s “impressed force” as an instantaneous, accelerative force. 

8. Newton PNPM, pp. 55f. 

9. Hall and Hall (1962), p. 243. 


Section 1.2 


1. Aristotle, Physica, IV, 4; 212220. 

2. Descartes, Principia, 11, 15 (AT, VII, 48 f). See also Regulae, XII (AT, X, 426). 

3. Descartes, Principia, II, 25 (AT, VIII, 53). 

4. Hall and Hall (1978), p. 104. 

5. The reader is referred to the excellent books by Max Jammer (1954) and Alexander Koyré 
(1957). 

6. For greater precision, let us recall that the geometry of Euclid (the set of consequences of all 
the propositions of the Elements) can be derived from the axioms proposed in the first edition of 
Hilbert’s Grundlagen der Geometrie and can therefore be given a denumerable model (Hilbert, GG, 
§9, pp. 34 ff.). This, however, will not satisfy the needs of a Newtonian physicist, who describes 
motions by means of differentiable mappings of time intervals into space. He requires a model of 
the stronger theory axiomatized in the second edition of the Grundlagen, in which Hilbert added 
the “Axiom of Completeness”. The transition from the Greek “geometry of ruler and compass” to 
what is in effect the geometry of a flat Riemannian manifold homeomorphic to R® cannot be said 
to have occurred at any particular date. If, as I have argued elsewhere (Torretti (1978a), pp. 34 ff), 
Descartes’ algebra of segments is that of a complete ordered field, the transition was completed 
before the publication of Descartes’ Géométrie (1637). Hereafter, when we speak of Euclidean 
geometry, we shall understand it in its Cartesian sense, that is, as the theory axiomatized in the 
second and subsequent editions of Hilbert’s Grundlagen. A model of this theory is a Euclidean 
space. 

7. Herein lies, in my opinion, the essential difference between the philosophies of space of 
Newton and Leibniz; not in Leibniz’ thesis that space is a relational structure, to which Newton, as 
the full-blooded mathematician that he was, certainly subscribed. Newton writes, in the important 
early manuscript from which I have already quoted: “Just as the parts of time obtain their 
individuality from their order, so that, e.g., if yesterday could change places with today [. . .] it 
would no longer be yesterday but today, thus the parts of space are individualized by their 
position, so that if any two could exchange their positions they would at once exchange their 
identities and each would become the other. It is only by their mutual order and positions that the 
parts of time and space are understood to be the very same which in truth they are, and they do not 
possess any principle of individuation apart from this order and these positions” (Hall and Hall 
(1978), p. 103; cf. Leibniz, Fifth Paper to Clarke, §47, GP, VII, 400 ff). 

8. For Newton, a point is the dimensionless limit of a limit (line) of a limit (surface) of a part of 
space; see Hall and Hall (1978), p. 100. 

9. Hall and Hall (1978), p. 99. 

10. To avoid misunderstandings I wish to emphasize that the statement “The rigid body Bexists 
all by itself in Newtonian space and is presently moving” is meaningless not because it is 
empirically unverifiable but because it is unintelligible: there is no conceivable difference between 
the situation described by this statement and the one that would be described by saying that the 
same body Bis at rest. On the other hand, it makes perfect good sense to say that an isolated mass- 
point moves in a Riemannian space of non-constant curvature, even if by its very isolation it is 
unobservable ex hypothesi. The varying curvature of the space furnishes a criterion for identifying 
the successive positions of our mass-point. 

11. Newton, PNPM, p. 50. Readers unacquainted with the water-bucket and the two-globe 
experiments ought to take a look at Newton’s text. It will be found in English translation in 
Newton (Cajori), pp. 10-12; also in Smart (1964), pp. 85 ff.; in Capek (1976), pp. 100 ff., etc. It is 
worthwhile noting that the experiments can tell us, with the aid of dynamical principles, that the 
system in question moves, but not how it moves. Thus, in the water-bucket experiment, the relevant 
phenomena are the same if the bucket’s axis is at rest and each particle of water moves about it ina 


RAG ~ J* 


286 RELATIVITY AND GEOMETRY 


circle or if the bucket’s axis travels with a constant speed in a fixed direction perpendicular to itself 
and the water particles meander along its path. 


Section 1.3 


1. Aristotle, Physica, IV, 11; 21952. On the meaning of arithmos, which I have rendered as 
“measure” (though it usually means an integer greater than 1), see ibid., IV, 12; 220427-32; also, 
22032. Note that “motion” (kinesis) includes not only locomotion (phora), but growth and 
qualitative alteration as well. 

2. Aristotle, Physica, IV, 12; 220514-16. 

3. Aristotle, Physica, IV, 14; 223>22-23. 

4, Aristotle, Physica, IV, 10; 218°3-5; IV, 12; 22055; cf. 218513, 223>11. 

5. Newton, PNPM, p. 46; Hall and Hall (1978), p. 355. 

6. Newton, PNPM, p. 46. 

7. Newton, PNPM, p. 48. 

8. Credit for the first attempt at “demetaphysicizing” Newtonian space and time is usually 
given to Euler, but I am not impressed by his efforts in this direction. See the relevant texts, in 
English translation, in Koslow (1967), pp. 115—25, and Capek (1976), pp. 113-19; cf. also the 
critical discussion in L. Lange (1886), pp. 87-97. 

9. Aristotle, Physica, IV, 10; 218513. 

10. Hall and Hall (1978), p. 104. Not every critic of mechanics failed to mention this important 
assumption, however. James Thomson (1884a) posits it explicitly; see the quotation on p. 294, 
note 5. 

11. See, for instance, A. Sommerfeld (1964), pp. 71 f. 

12. The centrifugal force on the dynamometer is proportional to the angular velocity squared. 
A purist might object that by attaching an elastic contraption to our rotating system we violate the 
requirement that it be rigid. The usual answer to this objection would be that by taking a 
sufficiently small dynamometer the violation can be made negligible. Such an answer is sensible 
enough when one is doing physics, but I am not sure that it is admissible in a discussion of the 
conceptual framework upon which the very possibility of doing physics rests. At any rate, the 
objection can be easily disposed of: According to Newton’s principles, if our system rotates 
beyond the range of all external forces and the dynamometer is a permanent fixture of it, the shape 
and the angular velocity of the system will be constant. 


Section 1.4 


1. Newton, PNPM, p. 46. 

2. That is, if R? is endowed with the standard Euclidean metric, by which the distance from 
(a,,4), 43) to (by, bz, by) equals | (Z,(a, —b,)?)*”? |. 

3. The meaning of this can be readily explained if x and y are based on tetrahedra with pairwise 
congruent faces. Then, the tetrahedra themselves are congruent if x and y are similarly oriented. 
Otherwise, the tetrahedra are mirror images of each other. 


Section 1.5 


1. C. G. Neumann (1870), p. 14. 

2. C. G. Neumann (1870), p. 15. 

3. C.G, Neumann (1870), p. 26, Neumann suggests that the Alpha body might be represented 
by the principal axes of inertia of the universe, but adds that this representation lacks substance 
(Inhalt) because it cannot be supported or shaken by empirical evidence. 

4. C. G. Neumann (1870), pp. 16, 18. 

5. C. G. Neumann (1870), p. 18. See above, p. 12. 

6. See Corollary V to the Laws of Motion, quoted on p. 19. 
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7, L. Lange (18855), pp. 337 ff. Lange speaks of “inertial systems” (Inertial-systeme). I take 
from J. Thomson (1884a) both the word “frame” and the concept I express with it. 

8. Readers unfamiliar with the epithets in parenthesis should not be put off by them. The last 
two make it possible to define a differential calculus on the space. See, for instance, H. Cartan 
(1967). 

9. df (0, t)/ét is the derivative of a mapping of R into S;- and is therefore a mapping of R into 
£ (R, Sf), the space of linear mappings of R into (S--., 0’). # (R, S¢-) is canonically equated with 
(Sp, 0‘) by setting each we S, in correspondence with the linear mapping 4-+A4u (AER). This 
correspondence is an isomorphism of the normed vector spaces (Sf, 0’)and ¥ (R, S--) (H. Cartan 
(1967), p. 20). The following result is used in the derivation of equations (1.5.2) and (1.5.4): If gisa 
differentiable mapping of R into S; and fis a differentiable mapping of S; into S;- and h denotes 
the composite mapping f:g, the derivative h'(a) of h at aéER is the composite linear mapping 
S'(g(a))+g'(a), where f‘(g(a)) is the derivative of f at g(a)e S- and g'(a) is the derivative of g at a 
(see H. Cartan 1967), p. 31). 

10. In particular, if k is the resultant of several component forces, 5 (k) is the resultant of the 
images of those components by fj. 

11. Newton, PNPM, p. 63. 

12. See M. Strauss (1972), p. 21. 

13. The velocity with which the frame F, moves in the frame F, is properly represented by a 
vector in the relative space of the latter frame. This vector is mapped on v — u by the isomorphism 
that takes each point of the space of F, to the point of S- with which it coincides at the time when 
the zero of that space coincides with the zero of S;. Such irritating niceties are unavoidable if we 
desire to speak with precision and yet to abide by the present artificial approach that refers the 
unity of becoming to a plurality of spaces moving inside one another. Fortunately, in Section 1.6 we 
shall be able to take a more natural approach. 

14. Newton, PNPM, p. 64. Curiously enough, Corollary VI is usually ignored in critical 
discussions of Newtonian relativity. See however H. Dingler (1921), pp. 152 ff.; E. Cartan (1923), 
p. 335n. 3. H. Stein (19774), p. 19 f. suggests that Mach’s silence on this matter is a proof of his tact, 
and explains how the apparently relativistic implications of Corollary V] are cancelled in Newton’s 
theory by the Third Law of Motion. (I follow him in the text above.) Stein correctly points out 
that, in the light of its use in Newton's Principia, Corollary VI is tantamount to Einstein’s 
Equivalence Principle (see the proof of Prop. I], Th. II of Bk. lin Newton, PNPM, p. 94). 

15. McMullin (1978), Chapter II, shows that Newton probably shared the view, common in his 
time, that matter is essentially passive and that all action in the world is due to immaterial agents. 
But the Third Law clearly entails that, even if every action on matter is really due to goblins, 
phenomenally it must be traceable to a material object, with which the acting goblin is somehow 
associated, and on which the goblin associated with the matter acted upon will immediately react. 

16. The criterion furnished by the Third Law does not, of course, amount to an “operational 
definition” of a freely moving particle and an inertial frame. In the above example, the acceleration 
of B by 2’s reaction will generally be only a component of ’s total acceleration and it might not be 
easy to discern it. But the criterion surely bestows a definite, intelligible meaning on the italicized 
expressions. 

17, The (strictly) inertial force and the centrifugal force also show up as stresses in bodies fixed 
in the frame in question. The Coriolis and d’Alembert forces appear on particles in motion in a 
rotating frame and depend, respectively, on the velocity of the particle and the angular velocity of 
the frame, and on the rate of change of the latter with time. See Landau and Lifshitz (1960), p. 128, 
eqn. 39.7. 


Section 1.6 


1. See also K. Friedrichs (1927), P. Havas (1964), A. Trautman (1965), pp. 104 ff., (1966). H. 
Stein (1967) drew the attention of philosophers to the concept of Newtonian spacetime, which has 
subsequently found its way into the better philosophical books in the field. See, for instance, Sklar 
(1974), Nerlich (1976). 

2. Einstein (1956), p. 31. 

3. Galilei, Discorsi, EN, VII, 272. 
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4. D’Alembert, article “Dimension” of the Encylopédie (vol. IV, 1754), Lagrange (1797), p. 223. 
The original texts were transcribed by R. Archibald (1914) in the Am. Math. Soc. Bulletin. See also 
Kant (1770), p. 17 n. (in Kant, WW, IT, 401 n.). 

5. A. M. Bork (1964) collects and discusses several passages of late 19th-century physical 
literature in which reference is made to a fourth dimension. Practically all of them refer to a fourth 
spatial dimension, not to time as a fourth dimension. Reading Bork’s quotations one has the 
impression that the authors were inhibited from conceiving a spacetime by their assumption that a 
four-dimensional manifold—as Maxwell put it—‘“has four dimensions all coequal” 
(J. C. Maxwell to C. J. Monroe, 15th March 1871; quoted by Bork (1964), p. 329.) 

6. According to H. Freudenthal (1968), p. 227, non-metric differential geometry was created by 
Sophus Lie (1842-99). 

7. See, for instance, H. Poincaré (1893), and my comments in Torretti (1978), pp. 345 f. 

8. Hereafter, I shall call the set of arguments at which a mapping f takes a particular value a, 
the fibre of f over a. 

9. The said mapping sends the value of X at p to the value of Z at the point of Sg with which 
p coincides at time 0. It must be a rotation or a rotation-cum-parity of R?, because the value of x! 
at (0,0, 0) coincides at time 0 with the value of 27! at (0, 0, 0). (Parity means reflection in (0, 0, 0)). 

10, The nine numbers a,, must satisfy the six conditions of orthogonality Z,a,,ag, = d,,. 

11. Let be such a hyperplane and let 6 = y:x~' be a Galilei coordinate transformation. That 
6(c)||o follows from the fact that y° = x° +k°. A glance at equations (1.6.3) should persuade us 
that, if ABCD is any tetrahedron in o, 0(A BCD) differs from a perpendicular projection of ABCD 
on 0(c) only by a translation followed by a rotation. The restriction of @ to o is therefore an 
isometry of ¢ onto (a). Note also that both o and 6(c), as hyperplanes of R*, are isometric to R’, 
with the usual metric, and are therefore copies of Euclidean space. 

12. An n-manifold S is said to be analytic if the coordinate transformations between the charts 
of its maximal atlas are analytic, i.e. if they can be expanded about each point in their domain into a 
convergent power series. Analytic mappings of one analytic manifold into another are defined like 
differentiable mappings between ordinary n-manifolds (p. 258). Let G be a group which is also an 
analytic n-manifold. G is an n-parameter Lie group if the group product is an analytic mapping of 
G x G into G. The proper orthochronous Galilei group is a connected 10-parameter Lie group. 
(Connected in the topological sense, i.e. not the union of two disjoint non-empty open sets.) The 
full Galilei group is obtained in the same way, if we lift the restrictions (ii) and (ili) that we imposed 
on the charts of Ag on p. 25. The full Galilei group will therefore include the transformations 
(t, x, y, z)-+(—t, xX, y, z) (time reversal), and (t, x, y, z(t, —x, —y, —2z) (parity) that were 
excluded by those restrictions. The full Galilei group is a disconnected: 10-parameter Lie group. 

13. In the case described, # is more accurately said to act on S “on the left”. F is a “left action” 
H# acts on S “on the right” and F isa “right action” if F (gh, s) = F (h, F (g, s)), for every g, he # and 
every sé S. If F isa right action it is evidently convenient to write sg for F (g, s). The reader will not 
fail to notice that, for any group G, the group product constitutes a transitive and effective left 
action of G on itself. 

14. See, for instance, Sudarshan and Mukunda (1974), pp. 255 ff. 

15. See, for instance, Lie (1893), pp. 130 f., 216 f. Note that the arbitrary element of Gp defined 
by equations (1.6.3) is an isometry if and only if the parameters v” are all equal to 0, that is, if no 
change of frame is involved. 

16. See p. 191. A pleasantly simple intrinsic characterization of the geometry of Newtonian 
spacetime will be found in H. Field (1980), Chapter 6. 

17. Herivel (1965), p. 149. 

18. Galilei, Dialogo, EN, VII, 212 f.; quoted after Stillman Drake's translation (Galilei (1967), 
pp. 186 f.). A passage very similar to the above occurs in Galilei’s letter to Francesco Ingoli of 
September 1624 (EN, VI, 547f.). 


Section 1.7 


1. Newton, PNPM, p. 433 (Bk. II, Prop. XXIX, Th. XIX, Cor. VI); p. 572 (Bk. ITI, Prop. VI, 
Th. VI). Newton’s result has been confirmed of late to one part in 10'*, a degree of precision rarely 
achieved in physics. See p. 310, note 5. 
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2. Note, however, that, if F(y) is the vector representing the force of 2 on 1 in terms of y, 
F(x) = A~'F(y), and therefore differs from F(y) unless A is the identity matrix, that is, unless 
x-y~! does not involve a spatial rotation. The magnitudes | F(x)| and | F (y)| are equal, of course, 
because A is orthogonal. 

3. Newton to Bentley, 25th February 1693; in Newton (1756), pp. 25 f. 

4. Newton, PNPM, p. 764 (General Scholium to Bk. III). 

5. Cf. Boscovich (1759), Kant (1756, 1786). 

6. See, for instance, Robertson and Noonan (1968), pp. 16-18. H. Stein (1970b) forcefully 
argues that the field-theoretical approach to gravity can be traced back to Newton’s own thought 
and words. 


Chapter 2 


Section 2.1 


1, Whittaker (1951/53), vol. I, pp. 53, 56f. 

2. Ampére (1827). English translation and commentary in Tricker (1965). Philosophical 
commentary—in French—in Merleau-Ponty (1974). 

3. Ampére, in Tricker (1965), p. 155. 

4. The electrostatic unit (esu) of charge is defined as the charge which, placed at unit distance in 
vacuo from an exactly equal charge, repels it with unit force. The electromagnetic unit (emu) of 
charge is the charge carried in unit time by one emu of electric current, the latter being defined as 
follows: (i) the emu of magnetic pole strength is the strength of a magnetic pole which, placed at 
unit distance in vacuo from an exactly equal pole, repels it with unit force; (ii) the emu of current is 
the current carried by either of two exactly equal closed circuits whose mutual potential is the same 
as that of two magnetic shells of unit pole strength bounded by the circuits. (The mutual potential 
of two circuits or shells is the work done by the field on one of them when it is transported parallel 
to itself from infinity to its present position, while the other remains where it is.) That the ratio c of 
the emu to the esu must have the dimension [length/time] can be seen by the following 
consideration due to Maxwell (TEM, §768): Let two parallel conducting wires be so placed that 
they attract each other with a force equal to the product of the currents /, 1’, conveyed by them. 
The amount of charge transmitted by the current J in time ¢ is {t [emu] or /tc [esu]. Let two small 
conductors be charged with the electricity transmitted by the currents / and J’, respectively, in time 
t, and place them at a distance r, such that the repulsive force between the conductors exactly 
balances the attractive force between the conducting wires. Then []' = II‘ t?¢?/r?. Hence c = r/t. 

5. See Whittaker (1951/53), vol. I, pp. 201 ff.; Woodruff (1962). Helmholtz objected to the fact 
that the electric forces were made to depend on the velocities and not only on the positions of their 
sources. He claimed—wrongly—that Weber’s theory violated the Conservation of Energy. He 
also pointed out—correctly—that in particular circumstances it predicted some physically 
implausible situations (involving, be it noted, faster-than-light motions—cf. Maxwell, 
TEM, §854). 

6. Maxwell, TEM, §§ 846 ff.; cf. §529 and Preface, pp. VIII, X. 

7. Hertz, EW. See S. D’Agostino (1975). 

8. Maxwell, TEM, p. IX. 

9. Cf. Faraday, ERE, §3071; Maxwell, TEM, § 47. 

10, Maxwell, TEM, pt. IV, ch. IX. Maxwell's original equations are transcribed in vector 
notation in Gillespie, DSB, vol. IX, p. 212. The slightly simpler system published in Maxwell (1865) 
is transcribed in vector notation in Berkson (1974). p. 176. 

11. Weber and Kohlrausch’s value was 3.1074 x 10° [cm/s] Fizeau’s value for the speed of light 
was 3.14 x 10!° [cm/s]. In 1890 J. J. Thomson and Searle improved the value of the ratio of the emu 
to the esu to 2.9955 x 10'° [cm/s]. Michelson, in 1882, had obtained 2.9976 x 10'° [cm/s] for the 
speed of light. Let me add that in his original formulation of equation (2.1.1) Weber used 
electrodynamic instead of electromagnetic units; the constant that turned up was therefore equal to 
c./2, not a particularly suggestive number. (Note that in the formula of Weber's force law in A. I. 
Miller (1981), p. 92, c stands for this peculiar constant; not for the speed of light.) 
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12. Maxwell (1865), §§ 15, 16; my italics. Cf. §74. It is worth noting that Faraday, who after 
discovering the magnetic rotation of the plane of polarization of light in 1845 became convinced 
that light was electromagnetic by nature, expected his lines of force to “supply for the vibrating 
theory the place of an ether”. “The view which I am so bold as to put forth considers, therefore, 
radiation as a high species of vibration in the lines of force which are known to connect particles 
and also masses of matter together. [t endeavours to dismiss the ether, but not the vibrations.” (My 
italics.) With unfailing instinct he remarks that “a uniform medium like the ether does not appear 
apt, or more apt than air or water” for the propagation of purely transverse waves (Faraday, 
“Thoughts on Ray Vibrations”, ERE, vol. IIE, p. 450). On Faraday’s final steps to field theory, see 
D. Gooding (1981). 

13. Maxwell (1861/62). 

14. Maxwell (1865), §73. 

15. Maxwell, TEM, §574. 

16. Maxwell, TEM, §570; cf. §552. See Poincaré’s very clear explanation (SH, ch. XII, 
pp. 219-25): By choosing a potential energy function U, depending on some observable 
parametres q,, and a kinetic energy function 7, depending on the q, and their time derivatives q,, 
such that the Lagrange equations (d/dt) (6(7 — U )/dq;) = 0(T —U )/6q,; agree with the data of 
experience, Maxwell proved that it was possible to give a mechanical explanation of electricity— 
indeed, infinitely many such explanations. See also Larmor (1900), §§ 49 ff., pp. 82 ff. Green (1838) 
and MacCullagh (1848) had already used Lagrangian methods in dealing with the optical aether. 

17. See Schaffner (1972), pp. 68 ff. Schaffner is probably right in thinking that Kelvin meant his 
models merely as illustrative mechanical analogies. 

18. Quoted by Schaffner (1972), p. 90, from the original letter in the Deutsches Museum, 
Munich. Heaviside’s suggestion is the core of the so-called electromagnetic conception of matter, 
proclaimed by W. Wien (1900) and much favoured in the first decade of the 20th century, especially 
after W. Kaufmann’s alleged experimental proof of the purely electromagnetic origin of inertial 
mass. See R. McCormmach (1970). On Kaufmann’s experiments, see A. I. Miller (1981), pp. 47-54, 
61-7, 228-32; see also E. Zahar (1978) for a subtle logical analysis of the interpretation of 
Kaufmann’s 1905 results. 


Section 2.2 


1. See, for instance, Hertz, EW, p. 246. Quoted on p. 293, note 17. 

2. Maxwell, TEM, § 601. “Electromotive intensity”, or the force ona particle with unit charge, 
is given according to Maxwell by the equation E = vx B~A-—V*y, where v is the particle's 
velocity, B the magnetic induction, A the vector potential and the electrostatic potential. The 
change made necessary by the transformation described above consists in adding to the potential 
y the scalar field wy’ = — A: w, where wis the linear velocity of the moving frame with respect to the 
fixed frame. Now OV w'-ds = 0, whence Maxwell’s conclusion. 

3. Towards the end of his life Maxwell proposed a method for calculating the velocity of the 
solar system through the aether from the velocity of light measured by Roemer’s method at 
periods when Jupiter lay in opposite directions from the earth. Though, as it turned out, 
astronomy was then unable to meet the exacting demands of Maxwell’s method, he saw no other 
way because, in his opinion, the motion of the solar system through the aether could not be 
ascertained in the optical laboratory. “In the terrestrial methods of determining the velocity of 
light—he wrote to D. P. Todd, of the U.S. Nautical Almanac Office—the light comes back along 
the same path again, so that the velocity of the earth with respect to the ether would alter the time 
of the double passage by a quantity depending on the square of the ratio of the earth’s velocity to 
that of light, and this is quite too small to be observed” (Maxwell to Todd, 19th March 1879; 
Maxwell (1879/80), p. 109). Soon thereafter, though unfortunately not before Maxwell’s death, 
A. A. Michelson, also of the U.S. Nautical Almanac Office, devised an interferometer that was 
capable of detecting just such tiny effects. 

4. Hertz (1908a); English translation in Hertz, EW, pp. 195-240. Arnold Sommerfeld 
remarked years later: “It was as though scales fell from my eyes when J read Hertz’s great paper” 
(Sommerfeld (1964), p. 2). 

6. Hertz, EW, pp. 241 f. 
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7. Hertz, EW, p. 242. 

8. Hertz’s version of the Maxwell equations is transcribed in vector notation in A. I. Miller 
(1981), pp. 11 f. The change they undergo when referred back from a moving system to the 
laboratory rest frame is stated and briefly explained on p. 12. For a less laconic explanation, see 
Silberstein (1924), pp. 30 f., 56 ff. 

9. W.C. Roentgen (1885, 1888, 1890); A. Eichenwald (1903); H. A. Wilson (1905). For a short 
discussion of these experiments, see Laue (1955), pp. 11 f. 

10. Let h be the height of the objective above the eyepiece. Then the time the corpuscle takes to 
travel from the former to the latter is hc~ '. In the same time the eyepiece travels the distance vhc~'. 
Clearly tan a = vhc~'/h = v/c. Though Bradley analyzes the same special case proposed here, he 
derives a formula valid in the general case, in which light might not fall vertically. Let @ be the real 
and @’ the apparent inclination of the star. Then v/c = sin(@—6')/sin 6’. 

11. Stokes (1845); reprinted in Schaffner (1972), p. 131. 

12. Schaffner (1972), p. 137. 

13. Predicting this null result was no mean achievement, “for if the tangent of the angle of 
aberration is the ratio of the velocity of the earth to the velocity of light, then since the latter 
velocity in water is three-fourths its velocity in a vacuum, the aberration observed with a water 
telescope should be four-thirds of its true value” (Michelson and Morley (1887), p. 449). 

14. Michelson (1881); reprinted in Kilmister (1970), p. 103. 

15. Michelson and Morley (1887), p. 459. The Supplement at the end of this paper suggests that 
the aether drag may diminish on hill-tops, an effect subsequently studied by Michelson with null 
results. This suggestion, however, does not stem from a lingering faith in Stokes’ hypothesis, but 
from the consideration that —as Michelson wrote to Lord Rayleigh—the aether may be dragged 
at sea level by the nearby mountain chains, for “Fizeau's experiment holds good for transparent 
bodies only, and J hardly think we have a right to extend the conclusions to opaque bodies” (Letter 
of 7th March 1887, reproduced by Shankland (1964), p. 29). The same consideration motivates the 
proposal made by the authors in the said Supplement that the experiment be repeated using a glass 
cover instead of a wooden one on the instrument. 

16. I have found the writings of T. Hirosige (1969, 1976) and K. Schaffner (1969, 1976) 
particularly useful as a guide to Lorentz’s thought. A. I. Miller (1981), Chapter 1, is also very 
illuminating, but was not available when I wrote this Section. 

17. Lorentz (1892a), in Lorentz, CP, vol. II, p. 206. Where Lorentz writes “ce nom [scil. 
‘matiére’-—R. T.] sera donc appliqué a l’ether tout aussi bien qu’a la matiére pondérable”’, 
S. Goldberg (1969), p. 984, translates: “The name ether will be applied to all else besides 
ponderable matter.” Goldberg’s false rendering suggests that Lorentz, like some of his British 
contemporaries, believed the aether to be immaterial. 

18. Lorentz (1892a), in CP, vol. II, p. 228. 

19. Lorentz (1892a), CP, vol. II, p. 246. The vectorial notation used above was adopted by 
Lorentz later. 

20. Lorentz (1898), in CP, vol. VII, p. 114. Cf. Poincaré (1900), and Lorentz’s reaction in his 
letter to Poincaré of 20th January 1901 (published by A. I. Miller (1980), pp. 70 f.). 

21. McCormmach (1970), p. 467, gives the following intuitive explanation of how, according to 
Lorentz, a refringent body impresses on light a fraction of its speed, without actually dragging the 
aether: “When a ray strikes the first charged particle of a dielectric, Lorentz’s electrical force is 
called into play at once, displacing the particle from its equilibrium position. Because of its motion, 
the particle becomes the center of an electromagnetic disturbance which propagates outward at 
the speed of light, and this new pulse interferes with the already existing state of the ether. This 
sequence is repeated at the second charged particle and so on; and the resulting superposition of 
primary and induced vibrations accounts for the slowing and bending of light inside dielectrics.” 

22. Lorentz (1895), in CP, vol. V, p. 4. Lorentz adds: “If I say for brevity’s sake that the ether is at 
rest I only mean to say that one part of this medium does not move with respect to the other and 
that all perceivable motions of the celestial bodies are relative motions with respect to the aether.” 

23. Lorentz (1892a), in CP, vol. I], pp. 292, 297. See also Schaffner (1969), pp. 501 f.; Zahar 
(1973), p. 112. 

24. As the reader will surely notice, equations (2.2.5) differ but slightly from the standard 
Lorentz transformation applicable in a situation like the one described above. Cf. equations 
(3.4.18) on p. 61. 
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25. The above is a paraphrase of the conciser statement in Lorentz (1895), CP, vol. V, p. 84. For 
a short proof of the Theorem, see Whittaker (1951/53), vol. I, pp. 406 f. 

26. Lorentz (1895), CP, vol. V, p. 90. Lorentz (1899) restates the Theorem of Corresponding 
States in almost the same terms, but understands it in a somewhat different sense. The primed 
coordinates are now related to a Galilei chart adapted to the rest frame of the moving material 
system by the transformation (2.2.9) on p. 47, which is similar to, but not the same as (2.2.4). See 
Lorentz, CP, vol. V, pp. 149 f. 

27. Lorentz (1916), p. 321. See, however, Lorentz (1899), in CP, vol. V, p. 150; Lorentz (1899*), 
quoted on p. 47 and n. 35. Lorentz (1916), p. 224, clearly implies that local time, as defined in that 
book, is the time that natural clocks in fact keep. Joseph Larmor arrived independently to a similar 
conception—see, for instance, Larmor (1900), p. 179. 

28. Lorentz’s letter to Lord Rayleigh is quoted by Schaffner (1972), p. 103. 

29. Lorentz (18925), in CP, vol. IV, p. 221. 

30. These paragraphs are reproduced in the well-known collection, Lorentz et al., RP. 

31. Fora short discussion of the Michelson- Morley experiment see, for instance, A. P. French 
(1968), pp. 49-57, 63-4. 

32, More significant than the stationary oscillations of the molecules of a solid body at rest or 
moving inertially in the aether is the change in the oscillations that should result from a change in 
velocity if indeed the molecular forces holding the body together are altered thereby. Wood, 
Tomlison and Essex (1936) tested this consequence of the Fitzgerald--Lorentz contraction 
hypothesis in a quartz rod, obtaining a null result. 

33. Larmor (1897) had anticipated him (see especially loc. cit., pp. 221, 230), but there is no 
evidence that Lorentz was aware of his work. 

34, Lorentz (1899), in CP, vol. V, p. 152. 

35. Lorentz (1899*), in Schaffner (1972), p. 290. Lorentz adds: “The transformation of which I 
have now spoken is precisely such a one as is required in my explication of Michelson’s experiment. 
In this explication the factor ¢ may be left indeterminate. We need hardly remark that for the real 
transformation produced by a translatory motion, the factor should have a definite value. 1 see 
however no means to determine it.” These passages of the English version of Lorentz (1899}--- 
which in all likelihood was written or at least revised by Lorentz himself.---do not occur in this 
form in the French version in CP, vol. V. 

36. Lorentz (1899), in CP, vol. V, p. 154. 


Chapter 3 


Section 3.1 


1. Einstein’s letter to Habicht is reproduced in C. Seelig (1954), p. 75. 

2. Einstein (19056), p. 133; English translation in D. ter Haar (1967), p. 92. 

3, “Die vierte Arbeit liegt erst im Konzept vor und ist eine Elektrodynamik bewegter Korper 
unter Benutzung von Modificationen der Lehre von Raum und Zeit” (Einstein to Habicht, loc. cit.) 
The remaining two papers dealt with molecular dimensions and Brownian motion, and 
contributed decisively to the final acceptance of the atomic view of matter (Einstein (1905a, b). The 
best available English translation of Einstein (1905d) is to be found in A. I. Miller (1981), 
pp. 392-415. 

4. Shankland (1963), p. 56. 

5. Einstein (1949), p. 44. 

6. Einstein (1949), p. 46. 

7. Einstein's readiness to accept this conclusion may have been due, in part, to the influence of 
Ernst Mach. See Hirosige (1976), pp. 57 ff. 

8. That Einstein was well aware of this implication of his (1905b) can be gathered from the 
introductory remarks to his paper of 1907 on the inertia of energy (Einstein (1907a), pp. 372 ff.). See 
also Einstein's letter to Max von Laue of 17th January 1952 (quoted by Holton (1973), pp. 201 ff). 
In 1955 Einstcin wrote to Carl Seeliy that the novelty of his (1905d) with respect to Lorentz (1904) 
and Poincaré (1905, 1906) consisted in “the realisation of the fact that the bearing of the Lorentz 
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transformation transcended its connection with Maxwell's equations and had to do with the 
nature of space and time in general. [. . .] This was for me of particular importance because I had 
already found that Maxwell's theory did not account for the micro-structure of radiation and could 
not therefore have general validity” (quoted by Born (1956), p. 194). 

9. On the contrast between “phenomenological” and “deep” theories, see Bunge (1964). 

10. Einstein (1949), p. 52. 

11. Einstein (1949), p. 56. 

12. De Sitter (1913a,5,c) published observations of binary stars that were long regarded as 
proof that the speed of light is not affected by the velocity of its source. This interpretation of de 
Sitter’s results is questionable, however, if, as suggested by J. G. Fox (1962), the double stars are 
surrounded by a cloud of gas which does not partake of their motion and which absorbs and 
retransmits all the radiation we receive from the said stars. Terrestrial experiments by G. C. 
Babcock and T. G. Bergman (1964) and T. Alvager et al. (1964) have confirmed Einstein’s LP 
most convincingly. More recently, K. Brecher (1977) has verified the LP to | part in 10°, by the 
observation of X-ray sources in binary star systems. (Such sources are less liable than ordinary 
light sources to Fox's strictures.) 

13. Let the inertial frame F move with velocity v in the inertial frame F. According to the Galilei 
Rule, a light pulse travelling in F with velocity ck (where k is a unit vector), travels in F’ with 
velocity ck + v. If vy and c are given, |ck + v| depends on k. Consequently, if the speed of light is 
constant in every direction in F, it cannot be so in F’. 

14. Einstein (1949), p 52. See also M. Wertheimer (1959), p. 214. W. Berkson (1974), p. 296, 
recalls that Oliver Heaviside had worked out a theory concerning the field generated by electrons 
moving with velocities equal to or greater than c. 

15. Einstein (1905d), p. 891; English translation in Kilmister (1970), p. 187. A. I. Miller (1981), 
Chapter 3 analyses this passage in depth and comments on its historical background. 

16. Einstein (1905d), p. 891. This is the only text in Einstein (1905d) that might refer to 
Michelson and Morley’s experiment. Of course, it can also refer to other, more recent second- 
order experiments (for example those by Rayleigh (1902), Trouton and Noble (1904), Brace 
(1904)), or even to the first-order experiments of Hoek (1868, 1869), Airy (1871), Mascart 
(1872/74), etc., all of which yielded null results. On the role of Michelson and Morley’s experiment 
in the genesis of Special Relativity, see Holton (1969), Shankland (1973), Einstein (1982). 

17, In Hertz’s theory, “the absolute motion of a rigid system of bodies has no effect upon any 
internal electromagnetic processes whatever in it, provided that all the bodies under consideration, 
including the ether as well, actually share the motion” (Hertz, EW, p. 246). 

18. Einstein (1905d), p. 892; English translation in Kilmister (1970), p. 188. 

19. Shankland (1963), p. 49. Ritz’s theory is discussed by J. G. Fox (1965). 


Section 3.2 


1. The thesis that Special Relativity forgoes the universality of natural time is maintained not 
only by such well-known conventionalists as Hans Reichenbach and Adolf Griinbaum, but also, 
with varying degrees of emphasis, by A. A. Robb (1914), p. 6; H. Weyl (1923), pp. 144 f., 163 ff.;A.S. 
Eddington (1923), pp. 15 f., 27; L. Silberstein (1924), p. 92; R. B. Lindsay and H. Margenau (1936), 
pp. 75, 334; P. G. Bergmann (1942), p. 32 n.; H. P. Robertson (1949), p. 380; C. Mller (1952), p. 32; 
H. Arzeliés(1966), p. 19; W. Rindler (1977), pp. 28 f., etc. The thesis that Einstein merely questioned 
the uniqueness of natural time, and that in Special Relativity each inertial frame is furnished by 
nature with a universal time of its own, is implied by M. von Laue (1955), pp. 6f., 30 f.; and 
probably also by L. D. Landau and E. M. Lifschitz (1962), § 1, whose treatment of the question is 
not altogether clear to me. It also underlies the recurrent proposals for measuring the one-way 
speed of light without making any stipulations about distant synchrony (see, for instance, 
E. Feenberg (1974)), or for basing such stipulations on “our intuitive understanding of 
simultaneity” (F. Jackson and R. Pargetter (1979), p. 311). It is worth noting that near the end of his 
life Einstein still regarded the knowledge that “there is no simultaneity of distant events” as one of 
the “definitive insights” that we owe to Special Relativity (Einstein (1949), p. 60). 

2. My proposal is based on Einstein’s own practice, in his more mature writings; thus Einstein 
(1916), p. 772, characterizes “a Galilean system of reference” as one relative to which “a mass, 
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sufficiently distant from other masses, moves with uniform motion in a straight line” (English 
translation in Kilmister (1973), p. 144); see also Einstein (1954), p. 11. The footnote added to 
Einstein (1905d), §1, in Lorentz et al., RP, p. 27, to the effect that the stationary system is one in 
which the equations of Newtonian mechanics “hold good in a first approximation”, avoids the 
inconsistency at a rather high price, for the concept of a stationary frame thereby becomes 
irretrievably ambiguous. 

3. Einstein (1911), § 4. 

4. Einstein (1905d), p. 892; English translation in Kilmister (1970), p. 189. 

5. “Men have very good means of knowing in some cases, and of imagining in other cases, the 
distance between the points of space simultaneously occupied by the centres of two balls; if, at 
least, we be content to waive the difficulty as to imperfection of our means of ascertaining or 
specifying, or clearly idealising, simultaneity at distant places. For this we do commonly use signals 
by sound, by light, by electricity, by connecting wires or bars, or by various other means. The time 
required in the transmission of the signal involves an imperfection in human powers of 
ascertaining simultaneity of occurrences in distant places. It seems, however, probably not to 
involve any difficulty of idealising or imagining the existence of simultaneity” (James Thomson 
(1884a), p. 569). 

6. Einstein (1905d), p. 894; Kilmister (1970), p. 19. 

7. He probably also assumed tacitly that synchronizations performed by his method do not 
depend on the time at which they are carried out. Let § (A, B, C, t) signify that two clocks at rest at 
points A and B of a given inertial frame run synchronously according to Einstein's criterion, when 
the latter is applied by means of light-signals sent from a point C of the same frame at time ¢ 
(measured by a clock at C). Then we must expect that: (1) S(A, B, A, tp) at some definite time f, only 
if, for all ¢, S(B, A, B,t); (2)S(A, B, A, to) and S(A,C, A,t, }—at definite times tg and ¢,—jointly 
imply S (B,C, B,¢), for all t. I presume that this is the meaning of the symmetry and transitivity of 
synchronism postulated by Einstein near the end of his (1905d), § 1; for if it only meant that the 
four-place relation S is symmetric and transitive in its first two terms-—when the last two are 
fixed—it would follow trivially from the very meaning of synchronism and would not constitute 
an additional assumption, as Einstein says. 


Section 3.3 


1. Let f be the Lorentz chart described above. Then f is defined by the following set of 
relations: t = m‘f, x = 2, ‘f, y = m,'f, z = %3'f; where 7; is the i-th projection function on R*, 
which maps each real number quadruple on its (i+ 1)-th element. Note that the Lorentz charts 
mentioned in the above statement of the RP can of course be adapted to different inertial frames, 
but must all employ the same units of length and time, embodied in physically equivalent 
standards at rest in the respective frames. 

2. Einstein (1905d), p. 895; Kilmister (1970), p. 191. The above statement of the RP is an almost 
literal translation of Einstein’s. I have only substituted “Lorentz charts” for “coordinate systems in 
uniform translatory motion relative to each other”. The two expressions are not synonymous, but 
the one used by Einstein covers more systems than he actually intended, e.g. systems whose time 
coordinate functions are related by equation (3.2.1). My version of the LP is less literal but is 
entirely faithful to the meaning of the original. It is understood that the LP refers to light 
transmitted through empty space. 


Section 3.4 


1. Einstein himself derives the Lorentz transformation equations from the RP and the LP bya 
seemingly simpler method in his (1954), Appendix I; by a more elegant one in his (1956), pp. 29 ff. 
The original derivation in his (1905d), §3 has been analysed by J. Merleau-Ponty (1974), R. B. 
Williamson (1977), M. Jammer (1979) and A. I. Miller (1981). 

2. The latter assumption is not necessary for Einstein’s argument and is not made by him. We 
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introduce it here only because it will save us some tedious explanations. I recall in passing that 
today’s metrology defines the unit of time asa given multiple of the period, and the unit of lengthas 
a given multiple of the wavelength of some standard monochromatic light signals. If, contrary to 
current fact, the standards adopted in both definitions were the same the value of c would be fixed 
by metrological convention. 

3. Conditions (ii) and (iii) can be stated in a less simple but probably more familiar fashion, in 
terms of the Cartesian coordinate systems x and J. (it) says that, for « = 1,2, 3, the a-th axes of both 
Cartesian systems lie on each other and are similarly oriented at time x° = 0. (iii) says that the 
origin of j, ie. the point P of G such that ¥(P) = (0,0,0), moves in frame F with velocity 
components dx! (P)/dx° = v, dx? (P)/dx° = dx>(P)/dx°® = 0. As to condition (iv), it is designed 
to exclude time reversal. As M. Jammer (1979), p. 212, aptly remarks, the rejection of time reversal 
is implicit in Einstein’s argument for eqn. (3.4.16); see p. 61 and note 9, below. 

4. Einstein (1905d), p. 898; Kilmister (1970), p. 194. 

5. If we express Einstein’s formulae in our notation, we shall see that he sets x*(A) = y7(A) 
= 0, but x° (A) = 1. However, by setting x° (A) = y° (A) = 0 we do not risk any loss of generality. 

6. Einstein designates the auxiliary coordinate function u! and the parameter i = u' (B) by the 
same variable x’. When he asks us to choose x’ “infinitesimally small” he means the parameter. 

7. Note Einstein’s careful reasoning. The LP was assumed to be valid in F, whence F is to be 
regarded as the “stationary system”. On the strength of the RP, one can then infer that the LP is 
also valid in G. Observe also that equation (3.4.3) is introduced as a consequence of the definition 
of the Lorentz chart y, adapted to G. 

8. Anarbitrary point (42, a,b,c) is equal to (t, vt,0,0) + (t,a — 3vt, 0,0) + (t, vt, b, 0) + (¢, vt, 0, c), 
which is a linear combination of points of the four kinds considered above. 

9. By initial conditions (i) and (ii), the «-th axes of x and z—for all values of a—lie on each 
other and are similarly oriented at time x° = 0, and hence at all times. By condition (iv), time 
reversal is excluded. Since x and z are determined by measurements performed with clocks and 
unstressed rods at rest on the same frame, they fully agree with each other. 

10. Einstein (1905d), p. 902; Kilmister (1970), p. 198. Einstein's “reasons of symmetry” can be 
spelled out as follows: by means of equations w® = x°, w! = —x!, w? = x7, w? = x3, we intro- 
duce a new Lorentz chart adapted to F, relative to which G moves with velocity components 
(—v,0,0); at time w° = 14, the distance between the positions of the rod’s endpoints in F is 
A/p(—v) = Ww? (Q, 1) —W? (P, 0) = X7 (0,0) —X?(P,t) = A/e(v). 

1]. See, for instance, Arzeliés (1966), pp. 72 ff; Robertson and Noonan (1968), p. 45; Meller 
(1952), pp. 41 ff. The condition Z,v? < c? is discussed on pp. 71, 296 n. 9; equality is of course 
excluded because no inertial frame can move in another one with the speed of light (p. 57). 

12. It is evident that this collection of matrices represents a group, for (i) it includes the identity 
matrix /; (ii)if A and B belong to it, (AB)’ H(AB) = B’(A’HA)B = B'HB = H,so that ABalso 
belongs to it; (iii)if A7HA = Hand AB = 1, B'HB = B’A'HAB = (AB)'H(AB)=1' HI = H. 

13. The construction of the homogeneous Lorentz group as a Lie group is explained, for 
instance, in Kobayashi and Nomizu, FDG, vol. II, pp. 268 ff., Example 10.2. A proof that it is 
indeed a Lie subgroup of GL (4, R) can be given on the lines of Chevalley’s proof of this property 
for the rotation group SO(n). (Chevalley (1946), pp. 119 f., Proposition 2.) 

14, Ifa group G acts on the left on a set S (p. 26 and p. 288, n. 13), the orbit of an object seS 
under this action is the set { gs|geG }. 

15. Consequently, a succession of two or more pk transformations between Lorentz charts 
adapted to different, non-collinearly moving inertial frames necessarily involves a rotation of the 
space coordinate axes. This structural feature of the Lorentz group explains the physical 
phenomenon known as the Thomas precession. See, for instance, Arzeliés (1966), pp. 178 f.; 
Meller (1952), pp. 54 ff. 

16. Robertson and Noonan (1968), p. 43; Mller (1980), p. 476. 

17, Einstein evidently regarded both formulations as equivalent. See his (19115), p. 1. 

18. Moller (1952), p. 4, nicely combines both versions of the RP: “All physical phenomena 
should have the same course of development in all systems of inertia, and observers installed in 
different systems of inertia should thus as a result of their experiments arrive at the establishment 
of the same laws of nature.” The frame that matters is, of course, not the one in which the observer 
is installed, but the one to which he actually refers his observations. They need not be the same. 
Thus, terrestrial astronomers use the fixed stars as their frame of reference. 
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19. See, for instance, the articles “Parity” and “CPT Theorem” in Lerner and Trigg’s 
Encyclopedia of Physics (1981). 


Section 3.5 


1. C. Giannoni (1971), p. 580, observes that “the very concept of measuring a moving rod while 
not moving with it requires a conceptual innovation”. Apparently nobody thought of such an 
operation before Einstein who with characteristic sureness lighted on the only plausible way of 
performing it. Since Q—in our example—is moving in F,, it would be quite senseless to consider 
the distances between non-simultaneous positions of its vertices. 

2. The relation between C and C, as they meet for the second time is not governed by the 
transformation y:x~', but by the transformation y:x'~!, between y and a suitable chart x’ 
adapted to Fo. 

3. Prokhovnik (1967), p. 41, attributes this opinion to Herbert Dingle. 

4. The problem we have been considering was first raised by Einstein himself, in his (1905d), 
p. 904, and has since been discussed incessantly, under the name of “the Clock Paradox” or “the 
Paradox of Twins”. Marder (1971) gives a bibliography of some 250 items, to which more could be 
added now. The effect predicted by Einstein has been verified by observations of the decay rate of 
fast particles (B. Rossi, and D. B. Hall (1941), J. Bailey et al. (1977)) and of the behaviour of atomic 
clocks (J. C. Hafele and R. E. Keating (1972), C. O. Alley et al. (1977)). Good textbook analyses of 
the problem will be found in A. P. French (1968), pp. 154-9, and J. L. Anderson (1967), pp. 174-8. 
R. P. Perrin (1979) gives detailed calculations based on the realistic assumption that C undergoes 
acceleration during a brief but finite period as it changes frames. 

5. Eqn (3.5.5) was first derived from Einstein's Rule by Laue (1907). 

6. Einstein (1905d), §6. See also Larmor (1900), pp. 173 ff; Lorentz (1904); Poincaré (1905, 
1906); Minkowski (1908); Einstein and Laub (1908a, b). Cf. pp. 109, 115 ff. 

7. Einstein (1905e), (1906), (1907a, b), Planck (1906, 1907), Lewis and Tolman (1909), Frank 
(1909), Born (1909a, b, c; 1910), Laue (1911, 19115), Epstein (1911), Tolman (1912), etc. 

8. For example, if in (3.5.4) we take r=0.99c, w, =0.99c, and w,=w,; =0, then 
Zu? = 0.999899 c?, 

9. By definition, the four coordinate functions of a Lorentz chart must be real-valued, for they 
consist of three Cartesian space coordinate functions that measure distances along their 
parametric lines and a time coordinate function whose values must be linearly ordered. But if 
£,v? > c? in equations (3.4.19) some at least of the coordinates related by them would be complex 
valued. 

10. On tachyons and related matters see H. Schmidt (1958); S. Tanaka (1960); O. M. P. Bilaniuk 
et al. (1962); G. Feinberg (1967); Y. P. Terletskii (1968), Chapters V and VI; F. A. E. Pirani (1970); 
R. Fox et al. (1970); and E. Recami and R. Mignani (1974). P. Fitzgerald (1971) and J. Earman 
(1972b) address themselves to philosophical readers. Cf. also the interesting remarks in G. N. 
Whitrow (1980), pp. 356-60. 


Section 3.6 


1. If this is granted, the linearity of all Lorentz coordinate transformations follows easily, for 
each inhomogeneous Lorentz coordinate transformation is the product of an homogeneous one 
by a translation of R*. 

2. See, for instance, Iyanaga and Kawada, EDM, 358.B; cf. also 201, 248 f. The transitive 
action of a group on a set was defined on p. 26. Note that if S is homogeneous with respect to G, 
the orbit of each x € S, i.e. the set { gx |g € G}, is identical with S. The action of G does not therefore 
partition S into several distinct classes. 

3. They are also homogeneous relatively to their respective isometry group, but this fact is 
irrelevant to the question of linearity. 

4. By the choice of an arbitrary point as zero, an affine space is made into a vector space, 
isomorphic with the one whose additive group acts on it. Cf. p. 18. 
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5. Remember that if Vand W are vector spaces over the same field, their direct sum V@W is the 
vector space formed by all ordered pairs (v, w)—for each v € V, we W—with the linear operations 
defined by: 


a(0,,W,) + B(v2, Wz) = (av, + Buz, aw, + Bw) 


for any vectors v,, v2 € V, w,, w2 € W, and any pair of scalars a, B. 

6. To prove this statement we may assume, without significant loss of generality, that the time 
coordinate functions y° and y9 are identical, that the space coordinate functions yj and y% 
(a = 1,2,3) agree at time y® = 0, and that the angular velocity of F, in F, is the unit vector in the 
direction of increasing y}. The transformation is then given by the non-linear system: 


y yi = yicos y§ — y3sin y$ 
y y 


Nm NO 


ee HO 


y 
y 3 sin y$ + y3cos y$. 

7. The following proof is given by Berzi and Gorini (1969), p. 1524. If 2 is a positive integer, 
(3.6.2) follows from (3.6.1) by induction, if we set k = v. As f(0) = 0, we have that f(—v) = —f(v). 
Thus, (3.6.2) holds for any integer A. If A = y/v, where yp and v are integers, set pv = vw. Then 
uf (v) =f (pv) =f (vw) = vf (w). Thus, (4/v)f(v) =f((u/v)v), and (3.6.2) holds for any rational 
number 4. Assume now that fis continuous at 0. (3.6.1) entails then that fis continuous everywhere. 
Let A be any real number and (A,) a sequence of rationals converging to A. Then, as n goes to 00, A,v 
converges to Av and A, f(v) converges to Af (v). But 4, f(v) = f(4,v), which, by continuity, converges 
to f (Av). Thus, (3.6.2) holds for any AER. 

8. The argument on p. 75 can be traced back to Lalan (1937). In the above form it is due to 
Berzi and Gorini (1969), p. 1519. Like Lalan, they base it on the “homogeneity of space-time” (not, 
as in Einstein's text, “of space and time”). This homogeneity, i.e. the fact that the common domain 
of all Lorentz charts is an affine space acted on by the additive group of R‘, is simply taken for 
granted by these authors, as indeed it may be by someone who already knows that the Lorentz 
charts which the said domain by definition admits transform into one another by Lorentz 
coordinate transformations (in the mathematical sense). However, to assume the homogeneity of 
space-time in order to prove that Lorentz coordinate transformations (in the physical sense) are 
linear, does seem to beg the question. After all, as J. M. Lévy-Leblond (1976), p. 273, aptly remarks, 
“the homogeneity of space-time [...] does not hold in every conceivable physical theory”. 
Terletskii (1968), pp. 18 f., gives a similar proof of the linearity of the transformations, appealing to 
“the uniformity of space and time, i.e. the independence of the properties of space and time of the 
choice of the initial points of measurement (the origin of the coordinates and the origin of t for 
time measurements)”. Mittelstaedt (1976), p. 154, also invokes “the homogeneity of space and 
time”, which—as he explains—“means that no place in space and no instant of time is especially 
distinguished”. The facts adduced by Terletskii and Mittelstaedt would doubtless be highly 
relevant to the discussion of a coordinate transformation between two Lorentz charts adapted to 
one and the same inertial frame, for then indeed there would be no more than one homogeneous 
space and one homogeneous time under consideration. But where two such frames are involved we 
have to do with two Euclidean spaces and two Einstein times, and the homogeneity of each is not 
by itself sufficient to determine how each space/time pair is to be set in correspcndence with the 
other. 

9. This line of argument is developed in detail by V. I. Fock (1959), pp. 12-16, 377-84. The 
invariance of equations (3.6.3) and more generally of the Maxwell equations under the 
15-parameter Lie group generated by the transformations (3.6.5/6) was first pointed out by 
H. Bateman (1910) and E. Cunningham (1910). See however S. Lie and F. Engel, T7G, vol. II 
(published in 1893), p. 334, Theorem 24. Other methods for proving that the Lorentz coordinate 
transformations must be linear or for deriving them “without linearity assumptions” are explained 
by P. Suppes (1959), R. Weinstock (1964) and W. Benz (1967). In my opinion they are less 
significant than the two we have considered here. The proof of linearity by M. Dutta, T. K. 
Mukherjee and M. K. Sen (1970) baffles me. For, on the face of it, the authors only assume that the 
transformations under study are analytic permutations of R*. Yet surely the non-linear 
transformation fot, x-+x, yroy, 2-42 + sin ¢ satisfies this hypothesis. The proof must therefore 
rely on some additional, tacit assumption. 
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1. See the bibliographies in Arzeliés (1966) pp. 80-2, and Berzi and Gorini (1969), note 2. To 
the works listed there let me only add—besides Berzi and Gorini’s own rigorous and lucid paper-— 
G. Siissmann (1969), A. R. Lee and T. M. Kalotas (1975, 1976), and J. M. Lévy-Leblond (1976). My 
own presentation is mainly based on Ignatowsky (1910, 1911a, b), Frank and Rothe (1911, 1912), 
Pars (1921), Lalan (1937), Berzi and Gorini (1969) and Levy-Léblond (1976). 

2. See pp. 74f. I do not mean to say that we already possess the means of ascertaining how 
v determines the transformation. For although we have assumed that Tr and 7p: are unique, 
it might turn out that our assumptions, up to this point, are still insufficient to determine 
them. 

3. Asa matter of fact, lgnatowsky’s argument does not explicitly or implicitly depend on the 
premise that the transformations are linear. (I thank John Stachel for reminding me of this.) But it 
does presuppose that all points of an inertial frame move in another such frame at all times with 
the same velocity, i.e. the very assumption from which we inferred the linearity of the Lorentz 
transformations in Section 3.6 (see note 9 to the present Section). Frank and Rothe (1912) derive 
linearity from the following two hypotheses: (I) The Principle of Inertia holds good in all inertial 
frames. (This leads to (A), on p. 75, and hence to the conclusion that the transformations must be 
projectivities of the form (3.6.4).) (II) If any two particles move with the same velocity v in an 
inertial frame F, they move in any other inertial frame F’ with the same velocity v’. (This implies 
that the projectivity which transforms a coordinate system adapted to F to one adapted to F’ must 
be affine—i.e. that it must send points at infinity to points at infinity—so that its restriction to the 
finite domain is in effect a permutation of R* and therefore a linear mapping.) 

4. The speed |v| with which either frame moves in the other one must be understood to mean 
the distance traversed in the stationary frame by a point fixed in the moving frame, in unit time 
measured by clocks at rest in the stationary frame. The Reciprocity Principle yields therefore a 
prescription for the synchronization of clocks in an inertial frame, which does not rely on optical 
means. If speed were taken to mean “self-measured” speed, as defined by Bridgman (1963), p. 26, 
ie. the ratio of the distance traversed to the time measured by a clock carried by the moving object, 
the Reciprocity Principle would not say a word about the time coordinate functions attached to 
distinct inertial frames and could not therefore be used for inferring anything with regard to the 
manner how those time coordinate functions transform into each other. 

5. Thus, for example, the Relativity Principle is satisfied if the inverse of ve J is —v/(1 —v). 
Ignatowsky (191 1a), p. 5, argues for the Reciprocity Principle as follows: Let AB and A’B’ be unit 
measuring rods at rest respectively in the inertial frames F and F’. Let A’ B’ slide along AB, so that 
B' passes A at time tg, and passes B at time t, (of the unprimed coordinate system). Let 4 and ¢’', be 
the times (of the primed system) at which A passes B’ and A’, respectively. Then, according to 
Ignatowsky, the Relativity Principle demands that Ar = t, —t) = t —t) = At’. Consequently, 
the speed 1/At of F’ in F equals the speed 1/At’ of F in F’, and their respective velocities can only 
differ in sign. But Ignatowsky has obviously overestimated the strength of the Relativity 
Principle. From it you can infer that At is the same function of At’ as At’ is of At, but not that that 
function is the identity. 

6. The reader may wonder why this should depend also on the speed |p| of F’ in F, and not— 
by analogy with (b)—on the speed—call it |v’|—of F in F’. But v’, being the inverse of v in the 
velocity group, is uniquely correlated with it. And it is not being claimed that the dependence (c) 
has the same form as (5). 

7. Thecontinuity of the mapping is a truism if, as one would tend to assume, the velocity group 
is a one-parameter Lie group (or at least a one-dimensional topological group). Berzi and Gorini’s 
proof makes use of the theorem that every continuous bijective mapping of a connected set of real 
numbers onto itself is either strictly increasing or strictly decreasing. They argue thus. If the 
mapping v+ov' is strictly increasing, and v < v', v’ < (v')' = v, which is absurd. Likewise, if v’ < v, 
vu = (t') <v'. Therefore v =v’. Hence, unless the mapping in question is the identity—an 


alternative tacitly excluded by the authors—it must be strictly decreasing. Set v* = — v’. By (3.7.3) 
—(-v’') =v, whence the preceding argument can be applied to the strictly increasing function 
vrov*, to show that v* = v. Therefore, vo’ = —v. 


8. To verify this statement, pre- and postmultiply the matrix [a,;] by the matrix ofa rotation of 
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the (y, 2)}-plane by an arbitrary angle 0, viz. 


1 0 0 0 
0 1 0 0 
0 0 cos@ sin@ 
0 0 -—siné@ cosé 


and equate the results. 

9. From this point on, to eqn. (3.7.14), we closely follow Pars (1921). As I remarked in note 3, 
Ignatowsky (1911a) did not initially claim that the transformation from the unprimed to the 
primed system was linear. He did, on the other hand, argue that due to the isotropy of the planes 
perpendicular to the direction of motion, the coordinates y, y’, z and z’ can only occur in the 
transformation equations implicitly, through the distances r and r’ from any given spatial point to 
the x-axis and, respectively, the x’-axis. The transformation equations read therefore: 


t=f(t,x,rv) x’ = g(t, x,r,v) r' = h(t, x, r, v) (*) 


where f,g and A are smooth but otherwise arbitrary functions of 1,x,r and the velocity v. 
Ignatowsky invokes the Relativity Principle to conclude that 


r=fyx rie’) x=g(t',x rv) r=h(t',xr,v’) 


where v’—the velocity of F in F’—is, as we know, equal to — v (see note 5; this value for v’ is used 
by Ignatowsky (191 1a) on p. 11, in the transition from eqn. (2) to eqn. (3)). Ignatowsky’s next step is 
to take differentials of both sides of eqns. (*): 


dt’ = fog dt + fo, dx +fo2 dr 
dx’ =fyydt+fi,dx+fi,dr (**) 
dr’ =fagdtt+fy,dx+fy,dr 


where the f, are functions of t, x, rand v. He then argues that f;, = 4, = Oand that/,, = 1,so that 
dr’ = dr, whence fy, = f,, = 0. The rest of the argument is less straightforward than Pars’ proof, 
essentially because Ignatowsky does not rely on the linearity of the system (**). But the reader 
cannot fail to observe that eqns (**) follow from eqns (*) only if dv = 0, ie. if the velocity v is the 
same for every point of the relative space of the moving frame, at every instant of its relative time. 
As I argued in Section 3.6, this entails that the transformation is linear. 

10. If times and lengths are measured in units of the kind we have agreed to use (p. 56), k is of 
course a pure number. 

11. The reader may wish to complete the computation of the matrix of the transformation 
(3.7.5). We know already that, in (3.7.8), B = u, so we have only to determine £ and v. Putting 
w = ~—v in (3.7.13b) we obtain 


B(0) = (1 — kv?) (Bw)? 
Setting v = 0 in the foregoing equation we see that 
B(0) = (B(0))? = 1. 
Thus, 


kv 
v(v) = SS 


1 
J1—kv? Jl — ke? 


Hence, equations (3.7.5) finally become: 


B(v) = 


; t—kvx y 
ere yy 
Ji — kv? 
6 itl (T) 
x i 4 


Ji —kv? 
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If 0 <k = 1/c’, they are indeed the same as the pk Lorentz transformation equations (3.4.18). 
If k = 0 they reduce to a homogeneous pk Galilei transformation without rotation (cf. eqns 
(1.6.2)). 

12. If k < 0, we can choose our units so that k = —1. Then u = vaw = (v+w)(1 —vw) |. If we 
set v = w = 1/2,u = 4/3 > 1. Consequently, 1 ¢/. If we set v = w = 1, it turns out that u = 00. Ifv 
and ware finite but greater than 1, u < 0. We have indeed assumed that 1/2 € J. But such must be 
the case if I includes any positive number, no matter how small, as will be readily gathered from 
our sketch of the structure of the group (/,*) in note 13. 

13. Lalan (1937), pp. 90-3. To see why a negative value of k is irreconcilable with the 
Chronology Principle, we can make use of the above remark that, if k <0, the group (J,*) is 
homeomorphic with the projective line and hence with the quotient space formed from the circle 
S' by identifying diametrically opposite points. Let 5 stand for arc tan v. Set k = — 1. (3.7.14) can 
then be written as: 


Ee = tand+ tan wW 
(tan D)«(tan w) = ———————-- = tan (0+ W) 
1—tandtanw 


The mapping b++tan v is thus a continuous homomorphism of the 1-parameter group of 
rotations O(1) onto the group (/,*). Substituting —1 for k and tan d in eqns (T) in note 11, we 
obtain: 
t'=tcosd+xsinvd yoy 
. (T*) 
x’ = —-ftsind+xcost z'=2z 


Thus, if k < 0, the group G of transformations between matching coordinate systems adapted to 
collinearly moving inertial frames is none other than the realization of O(1) as the group of 
rotations of the (t, x)-plane. No wonder therefore that a pair of events that share the same space 
coordinates in one system may turn up in another in reversed time order. Note that if we restrict 
the rotation angle @ to (0, 2), arguing perhaps that the image of this interval by tan contains every 
number we can think of asa velocity, the set of coordinate transformations governed by equations 
(T*)—that is, by eqns (3.7.5)—is not a group. 


Section 3.8 


1. Whittaker (1951/53), vol. II, p. 40. The “amplifications” mentioned by Whittaker are 
Einstein’s formulae for stellar aberration and the Doppler effect. 

2. See, for instance, M. Born (1956), pp. 100 ff., G. Holton (1960), (1964), (1968), C. Scribner 
(1964), S. Goldberg (1967), (1969), E. Zahar (1973), T. Hirosige (1976), K. Schaffner (1976), 
A. I. Miller (1981). G. H. Keswani (1964/5, 1965/6) sides with Whittaker, though he somewhat 
tempers the latter’s views. While Keswani is generally accurate in his handling of the sources—in 
contrast with Whittaker, who sometimes uses inverted commas rather superciliously—some of his 
mathematical arguments are untenable. For instance, in (1965/6), p. 27, Keswani criticizes 
Einstein’s derivation of (3.4.5) and (3.4.10) because he “uses velocities of light propagation equal to 
(c —v), (¢ +v) and J (ce? —v?)in utter disregard of his own second postulate”. Keswani apparently 
overlooks that the second postulate—the LP—applies to velocities referred to an inertial 
coordinate system whose time coordinate function is defined in agreement with Einstein (1905d), 
§ 1, whereas the velocities (c — v), etc., objected to by Keswani are referred to the auxiliary system u, 
defined by (3.4.2). 

3. Poincaré (1906), The paper was submitted on July 23rd, 1905. A five-page abstract appeared 
in the Comptes Rendus of the Paris Academy of June Sth, 1905. Einstein (1905d) was submitted to 
Annalen der Physik on June 30th, 1905. Consequently, if their theories were indeed the same, 
Poincaré would take precedence over Einstein. Second inventors have no rights. 

4. Poincaré (1954), p. 536. The quotation is taken from a course given at the Sorbonne in 1899. 

5. In his painstaking study of Poincaré (1906), A. I. Miller writes that Poincaré “considered the 
retardation of moving clocks as experimentally unobservable. Thus, it is not a real effect; rather, 
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the hypothesis of a local time coordinate is purely mathematical” (Miller (1973), p. 243). But, as 
noted on p. 45, unless natural clocks keep Lorentz local time, the latter cannot be used for 
calculating the actual outcome of experiments. 

6. Poincaré, SH, p. 101. Cf. the following statement of the “law of relativity”, published in 
1899: “The readings we shall be able to make on our instruments at any instant will depend only on 
the readings we could have made on these same instruments at the initial instant” (Poincaré, SH, 
p. 99). 

7. Poincaré, Oeuvres, vol. IX, p. 412. 

8. Poincaré, SH, p. 182. 

9. Poincaré, VS, p. 127. A few pages later Poincaré dismisses the suggestion that the aether 
should be viewed as part of the physical systems to which the principle of relativity applies. If 
motion relative to the aether could be detected the principle ought to be pronounced false. 
However, “every attempt to measure the velocity of the Earth relative to the aether has led to 
negative results. This time experimental physics has been more faithful to the principles than 
mathematical physics; the theoreticians would have readily discarded [the Relativity Principle] in 
order to achieve consistency in their other general views; but experience has obstinately confirmed 
it” (Poincaré, VS, p. 132). 

10. Poincaré, SH, pp. 111 f. 

11. Poincaré, VS, pp. 52 f. 

12. Poincaré, SH, pp. 182 f. 

13. Lorentz (1904), in CP, vol. V, p. 174; Kilmister (1970), p. 121. 

14. Lorentz (1904), in CP, vol. V, p. 176; Kilmister (1970), p. 123. It will be seen that, if we set 

~' = fin (2.2.10)—which is indeed compatible with the conditions prescribed for both quan- 
tities—(3.8.1) turns out to be the product of the transformations (2.2.10) and (2.2.9) of Lorentz 
(1899). 

15. Lorentz (1904), in CP, vol. V, p. 188; Kilmister (1970), p. 138. 

16. Lorentz (1904), in CP, vol. V, pp. 189 f.; Kilmister (1970), p. 139. 

17. See the note added by Lorentz in 1912 to the German translation of his (1904), in Lorentz 
etal. RP, p. 10. 

18. Lorentz’s statement of his two main hypotheses may give us an inkling of his intellectual 
style: (a) Charged particles or “electrons” are spherical when at rest in the aether, and “have their 
dimensions changed by the effect of a translation, the dimensions in the direction of motion 
becoming fi times and those in perpendicular directions / times smaller.” (b) “The forces between 
uncharged particles, as well as those between such particles and electrons, are influenced by a 
translation in quite the same way as the electric forces in an electrostatic system” (Lorentz (1904), 
inCP, vol. V, pp. 182 f.; Kilmister (1970), p. 131). The effects of translation on the electric forces can 
be calculated as follows: Let L’ be an isolated material system at rest in the aether, which differs 
from the moving system & only in that the dimensions parallel to the x-axis are multiplied by £/ 
and those perpendicular to it are multiplied by /; “then we shall obtain the forces acting on the 
electrons of the moving system &, if we first determine the corresponding forces in X’, and next 
multiply their components in the direction of the axis of x by /*, and their components 
perpendicular to that axis by /?/8” (Lorentz (1904), in CP, vol. V, p. 179; Kilmister (1970), p. 127). 

19. Lorentz (1916), pp. v, 229. 

20. Lorentz (1916), p. 230; my italics. 

21. Lorentz (1916), p. 230. 

22. Lorentz (1916), p. 321. See also Lorentz, CP, VII, p. 262. 

23. Poincaré (1906), p. 129; English translation in Kilmister (1970), p. 145. 

24. Poincaré (1906), p. 129; Kilmister (1970), p. 146. 

25. Poincaré (1906), p. 130; Kilmister (1970), p. 146. 

The first such modification concerns the transformation of electric charge density; Poincaré 
proves that Lorentz’s formula cannot be correct, for his transformed density does not satisfy the 
equation of continuity (Poincaré (1906), pp. 133 f.; Kilmister (1970), pp. 152 f.). Poincaré pointed 
this out to Lorentz in a letter written in 1904, published by A. I. Miller (1980), p. 76. 

26. See, for instance, Poincaré (1906), pp. 163, 165, and especially p. 151: “The Lorentz 
transformation substitutes an ideal electron at rest for the real electron in motion.” Cf. A. I. 
Miller’s remarks on this passage in his (1973), p. 266. 

27. Poincaré (1906), p. 132; Kilmister (1970), p. 150. 
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28. Ina letter to Lorentz written late in 1904 or early in 1905 (reproduced by A. I. Miller (1980), 
pp. 79 f.), Poincaré proved that the ! factor in (3.8.1) must be equal to 1 if it is assumed that the 
transformations between the several Lorentz primed coordinate systems defined by (3.8.1) as v 
ranges over the interval (—c,c) form a one-parameter Lie group. 

29. Poincaré (1906), p. 168; Kilmister (1970), pp. 175 f. Note however then in §4 of his paper 
Poincaré designates as “the Lorentz group” the broader group generated by Lorentz transform- 
ations and dilatations. The group characterized in the text above is obtained in §4.as a subgroup of 
this extended “Lorentz group” by imposing an additional requirement to the Lorentz equations 
(3.7.1), namely, that the constant ! be a function of v. Poincaré shows that / must then be equal to 1, 
if it is assumed that the transformations satisfying (3.7.1) subject to this requirement must still 
form a group. See Poincaré (1906), p. 146; Kilmister (1970), pp. 171 f. 

30. Poincaré (1906), p. 168; Kilmister (1970), p. 176. 

31. Poincaré (1906), §9; cf. A. I. Miller (1973), p. 308. 

32. Poincaré, SH, p. 215. 

33. A. I. Miller (1973), p. 320. 

34. I was happy to learn, after writing the present Section, that S. G. Suvorov (1979), pp. 542 ff, 
had reached a similar conclusion with regard to the source of Poincaré’s shortcomings. In my view, 
however, Poincaré’s conventionalism was not so radical as Suvorov depicts it; see Torretti (19784), 
pp. 320 ff. See below, p. 350, Addenda to this note. 

35. In a lecture given in London on May 4, 1912, Poincaré spoke of “the new conception of 
space and time”, resulting from “Lorentz’s Principle of Relativity”, according to which “space and 
time are no longer two separate entities [ .. . ] but two parts of the same whole, which are so 
intimately bound together, that they cannot be easily separated”. He describes it as “a new 
convention” that some physicists wish to adopt. “Not that they are constrained to do so; they feel 
that this new convention is more comfortable, that’s all; and those who do not share their opinion 
may legitimately retain the old one, to avoid disturbing their ancient habits. Between ourselves, let 
me say that I feel they will continue to do so for a long time still” (Poincaré, DP, p. 109). 

36. C. Seelig (1954), p. 134. 

37. For instance, in two lectures of 1912; Poincaré, DP, pp. 82, 108. On Poincaré’s silence, see 
S. Goldberg (1970). 


Chapter 4 


Section 4.1 


1. See D. Hilbert (1910). 

2. L. Pyenson (1977) and P. L. Galison (1979) study Minkowski’s contribution to Relativity. 
Both articles contain much useful historical information, but tend on the whole to underrate the 
physical significance of Minkowski’s mathematical and philosophical insight. For a fairer 
judgment of the importance of Minkowski’s work and valuable references to its early reception, 
see T. Hirosige (1968), pp. 46 ff. ] thank Kenneth Schaffner for calling my attention to this paper by 
the great Japanese scholar. 

3. This seminar is the subject of an interesting article by L. Pyenson (1979). 

4. Minkowski (1915). 

5. Minkowski apparently expected that some sort of “electromagnetic world-view” could be 
built on the Relativity Principle. Cf. Minkowski (1909), p. 111, last paragraph. 

6. Minkowski (1915), p. 372. 

7. Minkowski (1910), submitted to the Kgl. Gesellschaft der Wissenschaften zu Gottingen on 
21 December 1907; first printed in its Nachrichten in 1908. Our references are to the reprint in 
Mathematische Annalen. An appendix to this paper sketches a four-dimensional formulation of 
relativistic mechanics. 

8. Minkowski (1909); English translation in Lorentz et al., PR. The reader is advised to peruse 
this lecture now, if he has never read it. 

9. Minkowski (1915), p. 373. 

10. Minkowski, who was still under the influence of the Lie—Klein conception of geometrical 
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space as a “manifold of numbers” (Zahlenmannigfaltigkeit), equates “a worldpoint™ with “a point 
of space at a point of time, i.e.,a system of values x, y, z, £” (Minkowski (1909), p. 104; cf. his (1910), 
p. 475, last two lines). It might seem that “worldpoint” is only a fancy name for a number 
quadruple and that Minkowski’s “world” is none other than R*. But a few lines later he adds: “In 
order that there is nowhere a gaping void, we shall assume that something perceivable exists at 
each place at all times. To avoid saying matter or electricity, we designate this something by the 
word substance. Let us turn our attention to the substantial point that is present at the worldpoint 
x, y, z, t.” Since no physical, perceivable, substance can be present at a number quadruple, we must 
conclude that Minkowski speaks ambiguously. I resolve the ambiguity by distinguishing between 
a physical manifold of “worldpoints”, or “world”, and the “number manifold” R*, used for 
charting the former. I have discussed the concept of a Zahlenmannigfaltigkeit in Torretti (1978a), 
pp. 137 f., 173, 393 n. 42, 399 n. 81. 

11. The concept of a quadratic form is defined below, in note 1 (b) to Section 4.2. 

12. See above, p. 64. 

13. Poincaré was the first to note this; see above, p. 86. It can be proved by deriving the Lorentz 
transformation from the assumed invariance of (4.1.1); see, for instance, Einstein (1956), pp. 32 ff. 

14. One may always choose r so that its Euclidean length |r| = 0, and its sense is such that r 
represents, according to the accepted conventions, the rotation induced by p on each of the said 
hyperplanes. The three components r? suffice then to characterize p. The matrix (a,,) is expressed 
as a function of r by Sudarshan and Mukunda (1974), p. 238, eqn. (11). 

15. Let x and y be two Lorentz charts and denote y~ '+ x(P) by P’. Since f, is invariant under the 
coordinate transformation y:x~', F(P’,Q’) =f,(y(P’), ¥(Q’)) =fa(x(P), x(Q)) = F(P, Q). 


Section 4.2 


1. The following mathematical concepts will be essential in the sequel. 

(a) Let V be a real vector space. An inner product on V is a symmetric bilinear function 
f:Vx VR, such that f(v, w) = 0 for every we V only if v = 0. f is positive definite if 
S(v, v) > 0 for every ve V unless v = 0. It is negative definite if f(v, v) < 0 for every vev 
unless v = 0. If fis neither positive nor negative definite it is said to be indefinite. The 
structure <(V,f> is called an inner product space. Let V be n-dimensional. A basis 
(€;,..  &,) of V is said to be orthonormal iff (e;, e;) = +4), Le. + lifi = jand Oifi 4 j(I 
< i,j <n). Every finite-dimensional inner product space has an orthonormal basis. Let 
(e,,.. ., &,) be an orthonormal basis of ¢ V, f >. The trace of the matrix [f(e;, e;) ]—i.e. the 
sum £,f(e;, e;) of its diagonal elements—is called the signature of the inner product /. 
(Note that it does not depend on the choice of a particular orthonormal basis.) Evidently 
the signature of fis equal to n if fis positive definite, to — nif fis negative definite, and to 
some integer less than n but greater than —n if f is indefinite. 

(b) Let V be a finite dimensional vector space over a field F of characteristic # 2. A quadratic 
formon Visa function q: V — F such that (i) q(v) = q(—v) for all v € V,and (ii) the function 
hi Vx VF given by A(v, w) = (1/2) (q(v + w) — g(v) — q(w)) is bilinear. h is the bilinear 
function obtained by polarizing the quadratic form q. 

(c) If S is a set and d is a real valued function on S$ x S, <S,d) is a metric space if for any 
elements a, b, c € S(i) d(a, a) = 0, (ti) d(a,b) > 0 unless a = b, and (iii) d(a, b) + d(b, c) 
> d(a,c). A function d meeting these requirements is called a distance function on S. The 
reader should satisfy himself that if < V, f > is an inner product space with positive definite 
inner product f and we set d(v,w) =/(v —w, v—w) for every v,we Vd is a distance 
function on V. A metric space <S,d > is made into a topological space by the following 
stipulations: For each ae S,r é R, we define the open ball B(a, r) to be the set {xe S | d(a, x) 
<rt;a subset of S is open if and only if it is empty or it is a union of open balls. The 
topology thus defined is said to be induced by the metric. 

2. The null cone in R‘ is the set of real number quadruples { (k, kr,, kr2, kr3)|k eR; rtr? = 1}. 

3. To ensure that the space coordinate functions of the said chart form a right-handed system, 

as prescribed on page 66, we must impose an additional physical condition on H or replace (c) by 
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the following: (c’) If <t, r> = 1 and R, = u,P = w,Q, where (w,,Wg> = tag, and Cu,, u,> 
= <r, w,> = 0 (a, B = 1, 2, 3), there could exist an unstressed right-handed inertial tetrahedron 
OX ,X,X, of the kind described on page 14, such that P and Q occur at O and R, at X,. 

4. By (4.2.3), if yg, pEeLo, A, (Wt), P)j= H (Wu), P)= H(p-w(v), P). Consequently, 
Lip, L(g, H)) = LQW, Hy) = Lig: w, H). Note that L is a “right action”, in the sense defined on 
p. 288, note 13. 

5. This unfortunate name stems from a particular application of differential geometry to 
mechanics. To avoid confusion with the physical concept of energy I write “energy” in quotes when 
it stands for the geometrical concept. 

6. The concept of a critical point of E can be summarily explained as follows: Let K be the set 
of all parametrized curves o:[p,q]—S whose endpoints are }(p) and +(q). The mapping 
a +E(o) is a real valued function on K. Denote [p, q] by /,, and let I, be a closed interval 
containing 0. A family (¢,,) <¢ K, indexed by /, is said to be a variation of y if y = a9 and there is a 
smooth mapping f of I, x /, into S such that 0, (1) = f(u, t) for each ue], andeachtel,.yisa 
critical point of the function E if, for every variation (¢,) of y, the derivative dE(a,)/du is equal to 0 
atu=0. 

7, Customarily a curve y in a proper Riemannian manifold ¢S, > is said to be a “geodesic” if it 
is length-critical on every closed subinterval of its domain. If 7 is geodesic in this sense there exists 
always a reparametrization of y which is a Riemannian geodesic in the sense defined above (and 
hence also a geodesic of the Levi-Civita connection determined by #—see Appendix, pp. 277, 279). 

8. See the definition of a metric space in note 1. 

9. To show this one need only solve the geodesic equations (D.8) on p. 281 in the particular 
case in which x is a Lorentz chart, so that the g,, = 9,, and the I'}, are all 0. In this case, the right 
hand side reduces to 0. Hence, if isa Riemannian geodesic of <.4, 9), x yisa straight line in R*, 
whose image by x~' must also be straight in the affine space <.4, R4, Hy, 

10. Minkowski (1909), p. 108. 

It. W. Rindler, (1977), p. 43. 

12. The length of OB can be easily calculated from our data. Let @ be the angle between 
wand the x°-axis (or between the x!-axis and v). Obviously, tan 0 = v (the speed of Y in X). Set 
AOAB=.w. Clearly @+@/2=2/4. Hence sinw=cos26. Since AOBA = 0472/2, 
OB/OA = sinw/cos @ = cos 20/cos@. Noting that the perpendicular projections of OA on 
the x°-axis and on the’ x!-axis are, respectively, OAsin® and OAcos@, we infer from the 
hyperbola equation (x')?—(x°)?=1 that (OA)? = (cos?@—sin?0)~! = 1/cos26. Hence 

= (cos 20/cos? 6)'/? = /(1—-tan? 0)= /(1—v7), in agreement with the length 
contraction formula (3.5.1). The length of OS can be calculated similarly. OS = OBcos 6, But 
(OB)? = (OA)? = (cos? 9 —sin? 6)" '. Hence, OS = (cos 20/cos? 0)" ''? = 1/,/(1 — v2), in agree- 
ment with (3.5.2). 


Section 4.3 


1. Fibre bundles are defined in the Appendix, pp. 263 f. However, it is not necessary to master 
their definition in order to understand the present section. See note 3. 

2. A (q,r)-tensor at P is said to be g times contravariant and r times covariant. It is not, 
however, the only kind of mixed tensor at P that can be so described, It differs, for instance, froma 
(q+r)-linear function on (Sp) x (S})%, or one on (Sp) x (Sp) x (Sp)47! x (Sp)"~1, etc. To avoid 
cumbersome explanations, we shall consider only (q, r)-tensors in the sense defined above. The 
reader can easily adapt all definitions to the other kinds of mixed tensors one might conceive. 

3. The tangent bundle of S is defined on pp. 265. Higher order tensor bundles can be 
defined analogously. But the crucial notion of a (q, r)-tensor field on S can also be defined without 
resorting to the concept of a fibre bundle. Let x be a chart defined at Pe S. Let e, and e* stand, 
respectively, for the vector 0/éx' ‘lp and its dual dx‘(P). (e,,.. .,,) isa basis of Sp and (ef, .. ., ef) 
is the dual basis of S} (pp. 257-50): Hence, a (q, r)-tensor t at Pi is fully determined by the n?*" "real 
numbers t;! y= ter, CPs eis +) IS ink Sal shsgq 1<k <r). These real 
numbers are falieg the components oft relative to the chart x. Let T assign to each Pe S a (q, r)- 
tensor Tpat P. Then, if U c Sis the domain of a chart x, T assigns to each P € U the array of Tp's 
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components relative to x. Thus, T defines on U an array of n**’ real-valued functions, which we 
may call the components of T relative to x. T is a (q, r)-tensor field if every component of T relative 
to each chart of S is smooth. 

4. The module structure of ¥'(S) is defined in the Appendix, p. 260. The definition is 
readily extended to ¥ 3(S): bear in mind that the values of all elements of ¥#(S) at a given point 
PS belong to the same vector space attached to that point, formed by all pairs (P, Tp), where Tp 
is a (q, r)-tensor at P. 

5. Principal fibre bundles are defined on p. 263. 

6. The following remarks on components will be useful later. In this note, small Latin indices 
range from 1 ton. If the chart x is defined on U, < S, the n-ad field (6/éx') is a basis of W'(U,). If 
dx/ stands, as on p. 260 for the differential of the coordinate function x/, dx/(d/dx') = dx//dx' 
= 6,,. Hence, the co-n-ad field (dx') is the basis of ¥, (U,) dual to (6/€x'), If V is a vector field on S, 
its components V' relative to the chart x are given by: V' = V(dx') = dx‘(V) = Vx'. On U,, V is 
equal to a linear combination L‘d/dx'. We see at once that L'= L/6,,= Lidx‘(d/dx/) 
= dx'(L/0/0x/) = dx'(V) = V" For further details on the components of the covariant, 
contravariant and mixed forms of a given tensor field, relative to a particular chart, and on the 
transformations rules that link the arrays of such components, relative to different charts, the 
reader will do well to consult a good textbook on the tensor calculus, such as Lichnerowicz (1962), 
Synge and Schild, TC, or Lovelock and Rund (1975). 

7. It will be noticed that 1 am using the same letter T to designate three different things, 
namely, a mapping of the manifold S on its (q, r)-tensor bundle, a (q¢ +1r)-linear mapping of a 
Cartesian product constructed from the modules of vector and covector fields on U c S into the 
ring of scalar fields on U, and a mapping of the set of n-ad fields on U into a Cartesian product 
constructed from the said ring of scalar fields. But the first two of these three things are uniquely 
associated to one another by an isomorphism of the linear spaces to which each belongs (see, for 
instance, Abraham and Marsden (1978), p. 82, Theorem 2.2.8), while the third is unambiguously 
determined by the second. It is therefore more sensible to denote all three by the same letter than to 
burden our memories with three different symbols. It is also worth noting that, if two (q, r)-tensor 
fields T and T’ have the same array of components relative to a particular n-ad field X on S, T and 
T’ are equal. For the difference between the values of T and T’ at any suitable list of elements of X * 
and X is the value of the tensor field (T — T’) at that list. Now, if (T — T’) equals 0 at each relevant 
list, (T — T’) is the zero vector in the module of (q, r)-tensor fields on S, and T = T’. Evidently, in 
this case T and T’ have the same array of components relative to each n-ad field on S (or part of S). 
A similar argument can be invoked to establish the symmetry (skew-symmetry) of a tensor field of 
order 2 if the matrix of its components relative to a particular n-ad field on its domain is symmetric 
(skew-symmetric). 

8. See Appendix, pp. 261 f. 

9. See Appendix, pp. 267 f. 

10. These concepts are explained in the Appendix, pp. 274 f. 

11. See Appendix, Section C. 8, pp. 279 f. 

12. Paraphrased from Minkowski (1910), p. 484. Minkowski actually uses here the spacetime 
chart with imaginary time coordinates introduced by Poincaré (see p. 86). Hence, for him ro 
(which he calls r,) is not a real but an imaginary number. 

13. Compare this simple definition with the formidable construction in Minkowski (1910), 
pp. 484 f. 

14. Let y be another Lorentz chart adapted to F. V = Vy'd/dx'. As y® = x®, Vy® = Vx° = V°, 
and Vy*0/édy* = V—V°d/dx® = V. 

15. Let V|Fp be the restriction of V to Fp, and let X be, as on p. 56, the Cartesian chart on S; 
associated to x. The Cartesian vector field in question, which we may denote by V, is defined by the 
equations: dx*(V) = dx*(V| Fp). 

16. y,, abbreviates y, ,(0/dt|,); see p. 261. Hence, at y(z), by (A.7), 


Ui Uyi (; Jy dy’ 
a a)? dt 


17. Note that E(U) = 1 only ifc = 1; otherwise, E(U) = c?. Our choice of units normalizes the 
4-velocities of all material particles to unit length. The argument leading to (4.3.5) has been 


y(t) 
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simplified to avoid distracting the reader with too many symbols. Strictly speaking, the 
components of u, for each time t, are, according to (4.3.2), u*- 7 = d(y*-})/dt, while those of U, for 
each proper time t, according to (4.3.3), are U'- y = d(y'-y)/dt. If t is regarded as a function of t 
(Le., if ¢ stands, as, in de/dzt, for the mapping tro), we can readily verify that p*-y-t = y*-y. 
Hence U*+y = d(y%: y)/dt = (d(y*: ¥)/dt) (de/dt) = B,.,.(u*- 7). 

18. This definition, of course, is admissible only if A is in fact a 4-vector. The easiest way of 
proving it is to verify that its components obey the appropriate transformation rule. If we denote 
the components of U relative to y by Uj,,, and x is another Lorentz chart, it is clear 


that Uj,, = (6x'/dy/)Ui,,, for U is, by definition, a 4-vector. Hence Aj,, = dU{,,/dt 
= d/dt(Uj,,0x'/dy’) = (dx'/dy’)Aj,). Observe that the last equation results from the fact that 
if x-y~' is a Lorentz transformation, the dx‘/dy/ are constant functions. 

19. See, for instance, F. Kottler (1912); W. Pauli (1958), pp. 156 f.; V. Fock (1959), Chapter I'V, 
etc. 

20. See, for instance, H. Heintzmann and P. Mittelstaedt (1968). Reading this article, especially 
pp. 216 ff, 223 ff., it seemed to me that the authors assume that the components of a geometrical 
object relative to the frame on which they are being observed ought to be measured always by the 
same experimental devices, regardless of the type of frame involved. Yet nobody would wish to 
measure time ona spaceship with pendulum clocks, no matter how accurate they have proved to be 
in terrestrial labs. When extending 4-tensors to ordinary tensors, one should bear in mind that 
inertial and non-inertial frames are not physically equivalent, even if the tensor calculus provides 
the means for dealing with them mathematically on the same footing. 


Section 4.4 


1. The early development of Special Relativity is studied in detail by A. I. Miller (1981). 

2. In our units (c = 1). Except for this and for the change in notation, eqns (4.4.4) are the same 
as the “Maxwell—Hertz equations” given in Einstein (1905d), p. 907. In the standard vector 
notation they read thus: 


@E/éx°=VxH  —CH/éx° = VxE. 


3. Equations (4.4.5) can be derived from (4.4.4) by subjecting the differential operators in the 
latter to the Lorentz transformation (3.4.18). See, for instance, L. Silberstein (1924), pp. 225 f. Of 
course, E’ and H’ are divergence-free, like E and H. 

4. Cf. for instance, Lorentz (1892a), §42 (CP, II, p. 198); (1895), §§9, 12 (CP, V, pp. 18, 21); 
Einstein (1905d), p. 909. 

5. Einstein (1905d), pp. 909 f. 

6. Cf. the discussion of time-dependent vector fields on p. 103. 

7. More precisely: to their associated Cartesian charts. 

8. “Ifa unit electric point charge is in motion in an electromagnetic field, the force acting upon 
it is equal to the electric force which is present at the locality of the charge, and which we ascertain 
by transformation of the field to a system of coordinates at rest relatively to the electrical charge” 
(Einstein (1905d), p. 910; Kilmister (1970), p. 206). 

9. Einstein (1905d), p. 909, says that this is a consequence of the Relativity Principle, but 
evidently the RP alone yields no conclusions as to the invariance of charge. 

10. “These results as to the mass are also valid for all ponderable material points, because a 
ponderable material point can be made into an electron (in our sense of the word) by the addition 
of an arbitrarily small electric charge” (Einstein, (1905d), p. 919; Kilmister (1970), p. 216). 

11. J. Stachel and R. Torretti (1982) discuss the derivation of the mass-energy equation in 
Einstein (1905e). We there dissect and dismiss the allegation by H. I. Ives (1952) that Einstein's 
reasoning is fallacious. Other derivations of the mass-energy relation were given by Einstein 
(1906), (1907a), (1907), pp. 440 ff. and by Planck (1907); cf. also Einstein (1935). 

12. Einstein (1905e), p. 641. 

13. M. Bunge (1967) argues that the “numerical equivalence” of the (proper or relativistic) mass 
and the energy of a body does not imply that they are the same thing, for “numerical equalities do 
not entail the identity of the predicates involved” (p. 202). On the other hand, Einstein believed 
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that “science is fully justified in assigning ...a numerical equality only after this numerical 
equality is reduced to an equality of the real nature of the two concepts” under consideration. 
(Einstein (1956), pp. 56f.) Be that as it may, if a kitchen refrigerator can extract mass from a jug of 
water and transfer it by heat radiation or convection to the kitchen wall behind it, a trenchant 
metaphysical distinction between the mass and the energy of matter does seem far-fetched. Of 
course, if lengths and times are measured with different, unrelated units, the “mass” mf, 
measured, say, in grams, differs conceptually from the “energy” m, Bc”, measured in ergs. But this 
difference can be understood as a consequence of the convenient but deceitful act of the mind by 
which we abstract time and space from nature. Equation (4.4.13) and the equivalence of mass and 
energy can be vindicated by purely mechanical arguments, without appealing to the laws of 
electrodynamics; see, in particular, the beautiful paper by R. Penrose and W. Rindler (1965). 

14. Even E. G. Cullwick (1981), who sees in the 1905 definition of force the root of some alleged 
“inconsistencies” in Einstein’s electrodynamics, describes the definition itself as consistent but 
incorrect (loc. cit., p. 173). 

15. See, for instance, R. C. Tolman (1934), pp. 43 ff.; W. Rindler (1960), pp. 77 ff.; A. P. French 
(1968), pp. 167 ff. 

16. Page 110. Lorentz (1904) had given the now accepted value of the transverse mass (Lorentz, 
CP, V, p. 185, eqn. (30)). The discrepancy with Einstein’s value suggests indeed that in 1905 
Einstein had not yet read Lorentz’s paper, but does not prove it conclusively. Einstein’s value 
follows from his carefully contrived definition of force, and he expressly warns us that a different 
definition would yield another value. 


Section 4.5 


1. The following sketch cannot claim to be much more than an aid to memory, though I hope it 
will give the reader who first meets the subject a feeling of its drift and will move him to seek better 
instruction elsewhere. Besides the introductory monograph by W. Rindler (1960), the en- 
cyclopedia article by P. G. Bergmann (1962a) and the classical treatise by J. L. Synge (1955), 
I recommend the lucid and profound book by W. G. Dixon (1978), which clarifies and refines, 
particularly in Chapter 3, several key points that are shoddily dealt with here or in the foregoing 
Section. 

2. Minkowski (1910), p. 521, eqn. (22). Minkowski places the proper mass outside the scope of 
the differential operator, because in the present context he explicitly assumes that it is constant. In 
that case, as he correctly points out, the 4-force K is orthogonal to the 4-velocity U. But he was well 
aware that it was a special case, as one may gather from a remark he makes in another context, to 
the effect that “the quantity that deserves to be called the body’s mass . . . changes . . . with each 
heat transfer” (Minkowski (1915), p. 381). 

3. The components of the (1, 1) mixed form of F are F, = 7 Fs; those of its covariant form are 
Fi; = na F¥. It follows that the F/are identical with the Fu, except that Fj = —F*° = E,, while the 
F,, are identical with the Fi, except that Fp, = —F° = —E,. 

4. Use the transformation rule F’/ = (dy! i/ax")(dyi/ax*) F™, and (3.4,18’). 

5. The system on the left epitomizes the Maxwell equations usually given as: 


Vx E= —0H/éx° V-H=0. 
The system on the right epitomizes the Maxwell equations: 
V x H = 4nJ + 0E/0x° V-E=4np. 


A more elegant formulation is achieved by introducing the electrostatic potential g and the vector 
potential A, such that (i) H = Vx A, (ii) E = —0A/ét — Vo, and (iii) V- A + 0g/dt = 0. It can be 
shown that g and A are the timelike and spacelike parts, relative to their frame of Teerenee, ofa 
Ps vector, the electromagnetic 4-potential ®. Let x be the Lorentz chart such that ¢ = x° and 

= (@/dx*) in (i)}(iii). Relative to x, ®° = g and ®* = A,. We write ®'; for 6@'/x/. The gauge 
aaa (iti) reads then: }', = 0. We define . skew-symmetric covariant tensor field, com- 
ponentwise, by the equations (iv) F;; = ®, j (where the ®; are the components of the 
covariant form of ®). It follows from. {i), (ii) ey hote 3, that the E, are the components of the 
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covariant form of the electromagnetic field tensor F, whose contravariant form was given by 
(4.5.4). Equations (iv) guarantee that F is indeed a 4-tensor field. If 10 symbolizes the 
D’Alembertian operator nye? Ox! ex), classical electrodynamics is summarized by the following 
system of equations, of which the Maxwell equations (4.5.5) are mere corollaries: 


Oo = —4r)', 


6. Maxwell, TEM, §§ 641 ff. 

7. See, for instance, P. G. Bergmann (1942), p. 131. Our (4.5.11) is Bergmann’s (8.27a) with 
different units and notation. Bergmann’s general solution (8.30) differs from the particular 
solution (4.5.12) above, by an arbitrary constant 4-tensor field. 

8. That the S“ are indeed the components of a 4-tensor field follows from well-known rules of 
tensor algebra. See, for instance, Lichnerowicz (1962), § 39. 

9. H. Minkowski (1908), §13; M. Abraham (1909a, 1910). Max von Laue (1911), p. 528 refers 
to the second article by Abraham and to Sommerfeld (1910). 

10. M. v. Laue (1911), p. 529. 

11. The existence of such a tensor had been anticipated by M. Abraham (1909b), p. 739. 

12. For a simple proof, see W. Rindler (1960), pp. 153 f. 

13. For a proof, see, for instance, R. C. Tolman (1934), pp. 60-2. 

14, The energy tensor can also be defined for a discrete material system, using Dirac’s delta 
function. See, for instance, S. Weinberg (1972), p. 43. 

15. Noting that pu stands for momentum density in the equations of motion (4.5.13a), and for 
density of energy flux in the equation of energy (4.5.13b), we see that the symmetry of M implies the 
equality of these two, and thus provides an additional argument for the equivalence of mass and 
energy, as Laue (1911), p. 530, points out. Our notation, indeed, takes the said equivalence for 
granted. 

16. Through Lovelock’s theorem. See p. 160, and Lovelock (1972). 


Section 4.6 


1. Robb (1921) is an abridged exposition of the theory, which lists and thoroughly motivates 
its 21 axioms and states the main theorems without proof. Robb (1936) is the revised second 
edition of Robb (1914). 

2. Mehlberg (1966) includes an improved and highly compressed formulation of the system (in 
English). R. W. Latzer (1972) and J. W. Schutz (1973) axiomatize Minkowski spacetime geometry 
using worldpoint (“point”, “instant”) and horismotical connectibility (“a-relation”, “signal re- 
lation”) as their sole primitives. 

3. Reichenbach (1958), p. 268. 

4. Conditions (i}-(vii) are equivalent to Kronheimer and Penrose’s axioms for causal spaces. 
The irreflexivity of <, which they postulate instead of our (iv), follows immediately from (iii), (iv) 
and (vii). 

5. Strictly speaking, the Alexandrov topology of a causal space S is the topology generated by 
the chronological pasts and futures of the points of S (i.e. the weakest topology in which both 
I~ (P)and I* (P)are open, for every Pe S). The chronological neighbourhoods of the points of S 
constitute a base of this topology if and only if (i) for every Pe S there are points Q, R € S such that 
Q<P<R, and (ii) for every P,Q,,Q,,R,,R,¢S there are points X,YeS such that 
O,«X <P <«<Y <R,; (i = 1, 2). These conditions are met in all cases that interest us, but a causal 
space consisting only of the three points P, Q, and R, such that P < Q < R, does not meet them. 
Here the Alexandrov topology is the discrete topology, and there is but one chronological 
neighbourhood, namely 1*(P) 7/7 (R) = {Q}, which is not a base for a topology. Another 
topology that can be naturally introduced in a causal space is the weakest topology in which all 
causal pasts and futures are closed sets. 

6. We say that a curve y:/ — S enters n times into an open subset U c S if there are n real 
numbers t;€/(1 < i <n) such that y(t;) lies on the boundary of U and 1; is the g.l.b. of an open 
interval which y maps into U. 

7. In Robb’s system, such an assumption is only warranted by the entire set of axioms. The last 
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of them, Postulate XXI, which bestows on null straights the linear order of R and plays a role 
comparable to that of Hilbert’s Axiom of Completeness (page 285, note 6), is apparently 
motivated less by the observation of optical phenomena than by a desire to secure the “right” 
geometrical structure. 

8. Proof: Let y be a Lorentz chart of Minkowski spacetime. If x is a global chart of that 
satisfies the stated requirement, y~ '- x is an isomorphism of causal spaces. On the other hand, if 
there is a causal isomorphism / of the given structure onto Minkowski spacetime, y-fis a chart of 
M which satisfies the stated requirement. 

9. Winnie’s definition of congruence omits condition (ii). Hence, two non-causal vectors which 
together span a subspace of .@ p containing a chronological vector cannot be congruent. Not only 
is this consequence undesirable; it also invalidates the proof of Winnie’s Proposition 6.6. See 
Winnie (1977), pp. 175 f. 

10. See, for instance, Robb (1914), pp. 36, 47. Acceleration planes are called “inertia planes” in 
Robb (1936). Latzer offers another, very attractive, characterization of spacelike straights: Three 
distinct, pairwise separate points lie on the same straight generated by a spacelike vector if and 
only if no point of M is horismotically connectible to all three points (Latzer (1972), Def. 2.1 and 
Prop. 2.11, pp. 245s., 250s). Winnie characterizes timelike straights, somewhat more intricately, as 
follows: Let E(A) the set of all points horismotically connectible with Aé.#: if A < B, the 
intersection E(A) ~ E(B) will be homeomorphic to a sphere; let S(A, B) denote the union of all 
spacelike straights that join any two points in E(A) om E(B); three points A, B,C €.# such that 
A < B < Clie ona straight generated by a timelike vector if and only if S(A, B) 7 S(B, C)is empty. 
(Winnie (1977), pp. 160 ff., in particular, Prop. 5.4.) 

11. A. D. Alexandrov and V. V. Ovchinnikova, “Notes on the Foundations of Relativity”, 
Vestnik Leningrad. Univ., 11 (1953) 95 (in Russian). I have not seen this paper. I owe the reference 
to A. D. Alexandrov (1967). 

12. Zeeman (1964) uses < for our < and “causal” for our “chronological”. In view of Lemma I, 
these differences are innoctious. 

13. It follows that any géG must be an orthochronous similarity. For, as a causal 
automorphism, g maps null cones onto null cones; g is therefore a conformal linear mapping, i.e. a 
similarity. Since g preserves causal precedence, it must be orthochronous. See Zeeman (1964), 
p. 493. 

14. Zeeman (1964), p. 491. 

15. If Tis a topology ona set S, the topology induced by Ton U c Sis the weakest topology 
relative to which the canonical inclusion of U into S$ is a continuous mapping. 

16. Zeeman (1967), p. 162, counters this criticism with the reminder “that we are dealing with 
special relativity"—a baffling remark, for some of the best examples of the applicability of this 
theory are furnished by particle accelerators. 

17, Both the Zeeman and the path topology are Hausdorff, connected and locally connected, 
and neither normal nor locally compact. The path topology is, and the Zeeman topology is not, 
paracompact. Hawking, King and McCarthy (1976), p. 178, use this property for proving that the 
continuous curves of the path topology are precisely the “Feynman paths”, i.e. curves continuous 
in the manifold topology, whose ranges are constrained to lie within the local null cones but may 
zigzag with respect to time orientation like the worldline of an accelerated electron in Feynman's 
quantum electrodynamics (Theorem 2). 


Chapter 5 


Section 5.1 


1. Ashort non-mathematical discussion of the unsolved problems of Newtonian gravitational 
theory circa 1900 will be found in Whittaker (1951/53), II, pp. 144-9. 

2. Einstein (1913a), p. 1249. 

3. Einstein (1913a), pp. 1249 f. Cf. M. Abraham (1912/), p. 793. 

4. Lorentz (1900), in Lorentz, CP, vol. 5, pp. 206, 214. 

5. Lorentz (1900); CP, vol. 5, p. 211. 
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6. Laplace had shown that if gravitation propagates with finite speed V and, as he implicitly 
assumed, terms of the first order in v/V do not cancel out, experience requires that V be of the order 
of 10° times the vacuum speed of light. 

7. Other, deeper difficulties inherent in Lorentz’s theory of gravity are noted by P. Havas 
(1979), pp. 83 f. 

8. Poincaré (1906), pp. 166-75. A generalization of Poincaré’s Law of Gravity was proposed 
by W. de Sitter (1911). Whitrow and Morduch (1965), pp. 20 f., 24, 29 £., 45 ff, 50, 61, have worked 
out the predictions of the Poincaré--de Sitter theory for the classical tests of General Relativity. It 
does not fare so well as the latter when compared with experience (see the comparative tables, ibid., 
pp. 63 f.). On Poincaré’s theory of gravitation see also A. L. Harvey (1965), pp. 452-4. 

9. Poincaré (1906), p. 166. 

10. Poincaré (1906), p. 167. 

11. Poincaré (1906), p. 167. 

12. Referring to the calculations of Laplace mentioned in note 6, Poincaré observes that his 
illustrious countryman “had examined the hypothesis of the finite propagation speed ceteris non 
mutatis; but here, on the contrary, this hypothesis is complicated with many others, and there could 
well be a more or less perfect compensation between them” (Poincaré (1906), p. 170). 

13. Poincaré, of course, writes this and all other equations componentwise, but the coordinate- 
free language wonderfully fits his thinking. 

14. Hermann Minkowski also tried his hand at a relativistic theory of gravity. See Minkowski 
(1908), Anhang (reprinted in his (1910), pp.. 522 ff.), and the comments by L. Pyenson (1977), 
pp. 80-4. 


Section 5.2 


1. Planck (1907), p. 542. 

2. In the German original the last sentence reads: “Wenn diese Frage zu verneinen ist, was 
wohl was Nachstliegende sein dirfte, so ist damit offenbar die durch alle bisherige Erfahrungen 
bestatigte und allgemein angenommene Identitat von trager und ponderabler Masse aufgehoben” 
(Planck (1907), p. 544). In his eagerness to detract from Einstein's originality, Whittaker quotes the 
above passage as saying the opposite of what we have just read: “Now, said Planck, all energy has 
inertial properties, and therefore all energy must gravitate” (Whittaker (1951/53), I], p. 152; 
Whittaker makes reference to “Berl. Sitz., 13 June 1907, p. $42, specially at p. 544”). 

3. Einstein (19075), p. 442. See above, p. 112. 

4. Einstein (19075), p. 443. 

5. It had been demonstrated to one part in a thousand by Newton (p. 32) and to a much 
higher precision by Edtvés (1889), and has subsequently been confirmed to 5 parts in 10° by 
Eétvés, Pekar and Fekete (1922—originally submitted in 1909), to 1:10'! by Roll, Krotkov and 
Dicke (1964), and to an astounding 1:10’? by Braginsky and Panov (1972). Ohanian (1976), pp. 
18 ff. gives a schematic description of these experiments. See also the important methodological 
remarks of J. Ehlers (1973), p. 7. 

6. Einstein (19075), p. 454. Einstein does not explain any further what, exactly, is a uniformly 
accelerated frame in the context of Relativity Theory, but in later presentations of the Equivalence 
Principle he makes his meaning more precise. In his (1911), he specifies that the frame he has in 
mind must lie “in a space free of gravitational fields” (Einstein (1911), p. 899). Einstein (1912a), 
p. 356, demands that the frame be uniformly accelerated “in Born’s sense”, i.e. that the acceleration 
of each one of its points relative to its respective momentary inertial rest frame be constant. 

7. R. H. Dicke (1959), p. 623; (1964a), p. 13. 

8. Einstein’s remark ((1913a), p. 1255) that “Edtvés'’s experiment plays in this connection a 
role similar to that of Michelson’s experiment in relation to the problem of the physical 
detectability of uniform motion”, is purely rhetorical. 

9. Einstein (19075), pp. 461, 459. The matter is discussed again with great clarity and 
forcefulness in Einstein (191 1a), §§3, 4, which is readily available in English translation in Kilmister 
(1973). 

10, “{Ich ging] von der nachstliegenden Auffassung aus, dass die Aequivalenz von trager und 
schwerer Masse dadurch auf einer Wesensgleichheit dieser beiden elementaren Qualitaten der 
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Materie bzw. der Energie zuriickzufiihren sei, dass das statische Gravitationsfeld als physikalisch 
wesensgleich mit einer Beschleunigung des Bezugsystems aufgefasst wird” (Einstein (1912c), 
p. 1063; cf. Einstein (1911a), pp. 899, 903; Kilmister (1973), pp. 130, 134). 

11. Einstein (1912c), p. 1063. 

12. Let h be the distance between the observer and the source. The redshift can be 
calculated in 2, using the relevant relativistic formula for the Doppler effect: 


Vobs/Yem = Ju — yh)/(1 + yh) (Einstein (1905d), p. 912), for yh is the speed with which the source's 
mirf (at the time of emission) recedes from the observer's mirf (at the time of reception). The same 
redshift must be observed in Z,, where h is of course the height of the observer above the source, 
and yh the potential difference from the former to the latter. To a first approximation, the redshift 
AV = (Vem — Yobs)/Yem * Yh. From bottom to top of a tower 22.6 metres high (h ~ 7.538 x 1078 
light-seconds) such as the one used by R. V. Pound and his associates (1960, 1965), Av should 
amount to a frequency loss of some 2.46 x 10° '°Hz—given that y = 9.8 m/s? = 3.269 
x 1078s~!. The value measured by Pound and Snider agreed with this prediction to one part in 
100. For a detailed historico-critical study of the redshift as a test of Einstein's theory of gravity, see 
Earman and Glymour (1980b). 

13. The Equivalence Principle also admits another interpretation, according to which it is 
weaker than Einstein’s EP, yet stronger than the Weak Principle. This “semistrong” principle 
asserts the equivalence of all non-rotating free-falling local frames in the vicinity of any given 
worldpoint, but not the equivalence of frames which do not happen to be near the same 
worldpoint. A local version of Special Relativity is valid then in each sufficiently small spacetime 
region, but the numerical contents (speed of light, gravitational “constant”, fine structure 
“constant”, etc.) of the several versions can vary from one region to another. Imitating Einstein we 
may say that there is as yet no inducement to espouse the Semistrong Principle. Cf. the apt 
comments of W. Rindler (1977), pp. 19 f. 

14. Einstein (1912c), p. 1063. 


Section 5.3 


1. We should not forget, however, that Newton himself was aware of its usefulness: in the form 
of Corollary VI to the Laws of Motion it enabled him to deal with the Earth-Moon system, that 
falls freely in the field of the Sun, as if it were inertial. See page 287, note 14, and H. Stein (1977a), 
p. 19f. 

2. Einstein (1911a), p. 900; Kilmister (1973), p. 131. 

3. The formulae derived from the EP for the energy loss and the frequency shift experienced 
by radiation as it overcomes the pull of gravity are beautifully consistent with the law of 
proportionality between frequency and radiated energy postulated in Einstein (1905a) in order to 
explain the photoelectric effect. Einstein however does not mention it here, probably because his 
1905 hypothesis of light quanta was still far from obtaining general acceptance in 1911. 

4. Einstein (1911a), p. 906; Kilmister (1973), p. 137. Einstein's statement is somewhat 
misleading, for the Light Principle of Special Relativity was never supposed to hold in the 
accelerated systems to which systems at rest in homogeneous gravitational fields are assimilated 
here. There is, however, a genuine and deep difference between “the ordinary Theory of Relativity” 
and Einstein’s evolving theory of gravity, insofar as, according to the latter, the actual world 
cannot admit a global Lorentz chart (cf. p. 137). 

5. Einstein (191 1a), §3, before eqn. (2a); cf. Einstein (19075), p. 459 n. 1. 

6. Though the effect is significant it is easily swamped by thermal effects on the surface of the 
Sun: (Einstein (1913a), p. 1266.) As I indicated above in note 12, the gravitational redshift was 
confirmed, with an estimated uncertainty of 1° by Pound and Snider (1965) in the homogeneous 
field near the ground, in Cambridge, Mass. In 1977, Vessot and Levine used for the same purpose a 
hydrogen maser frequency standard placed on a spacecraft that travelled on an orbital arc with a 
maximum altitude of 10,000km. Preliminary results reported by I. I. Shapiro (1980a), p. 473, 
indicate that the ratio of the measured to the predicted redshift differs from unity by 5 parts in 10°, 
with an uncertainty of | part in 10*. 

7. This is half the deviation of 1.75” predicted by Einstein’s mature theory of gravity. See 
pp. 173; 322, note 23. 
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8. M. Abraham (1912a, b, c, f). Abraham (1912d, e) are specifically directed against Einstein. 
9. G. Nordstr6m (1912a, b, 1913, 1914), On Nordstrém’s theory, see Einstein (19134), Einstein 
and Fokker (1914), Whitrow and Morduch (1965), Misner et al. (1973), p. 429. 

10. G. Mie (1912a, b; 1913). 

11. Einstein (191 2a, b). Ina third paper, Einstein (1912c) discussed the Machian implications of 
his theory of the static field—see below, Section 6.2. Einstein (1912d) is the very instructive reply to 
Abraham (1912d); Einstein (1912e) is a five-line note putting an end to the controversy with 
Abraham. 

12. Einstein and Grossmann (1913); see also Einstein (1913a, 6). As we shall see in Section 5.4, 
the main steps leading to the new approach can be traced in Einstein’s papers of 1912. 

13. The three postulates regarding (i) natural clocks, (ii) freely falling particles and (iii) the 
spacetime direction of light signals give a definite physical meaning to the gravitational metric g. 
(ii) is introduced in Einstein and Grossmann (1913), § 2, (i) and (iii) in § 3. (iii) implies that the null 
cone determined in each tangent space of the spacetime manifold by the metric g meets the 
requirements of Special Relativity. It is clearly spelled out as a separate postulate in Einstein 
(19134), p. 1256: “The law of light-propagation will be determined by the equation ds = 0.” 

14. M. v. Laue (1920), E. T. Whittaker (1928). 

15. In Abraham’s correction to his (1912a), in Physikalische Zeitschrift, 13 (1912), p. 176. The 
magnitude of the 4-velocity, (dx‘/dt) (dx‘/dt) is —c? (Section 4.3, note 17). Hence d@/dt = 
—Fidx'/dt = —(d?x!/dt?) (dx'/dt) = — (1/2) (d/dt) (dx'/dt) (dx'/dt) = cde/dt. (I have adjusted 
all signs to the signature (— + + +) used by Abraham.) 

16. A Lorentz metric ona 4-manifold is the Minkowski metric q only if the Riemann tensor R is 
identically zero (p. 145). A tedious but straightforward calculation shows that not all the 
components of R relative to the coordinates x, y, z, ¢ will vanish in Abraham's spacetime. Writing 
x° for t, x! for x, etc., we have that 

F oe 
Reno = ¢ayaayh 


which usually will not be zero. Note in particular that the time-time component of the Ricci tensor 
is Rog = —cV7e. 

17. M. Abraham (1912c), p. 312. 

18. Let B stand for |(1 —v?c~ 7)" '/?|. The equations dt’ = B(dt —ve~*dx),dx' = B(—vdt + dx), 
dy’ = dy,dz’ = dz (cf. eqns (3.4.18)) are not integrable, for example, in a static gravitational field in 
which c depends on x but not ont. The conditions ¢B/¢t = —dBv/@x and — OBuc~?/ét = CB /0x 
cannot be met in this case, for the left-hand sides vanish identically, while the right-hand sides 
generally do not vanish (Einstein (1912a), p. 368). 

19. Einstein (1912a), p. 355. 

20. M. Abraham (1912d), pp. 1057 f. 

21. M. Abraham (1912/), p. 794. 

22. “Eine Missgeburt der Verlegenheit”. Einstein to Sommerfeld, 29 October 1912; in 
Einstein/Sommerfeld (1968), p. 26. 

23. G. Nordstrém (1912a), p. 2; cf. (19125), pp. 872 ff. 

24. G. Nordstrom (1913), p. 533; (19125), pp. 873, 878. 

25. Nordstrém (1913), p. 534. He gives credit for this idea to Laue and Einstein. 

26. Namely, whenever the gravitating body can be treated as a particle, according to the theory; 
hence, in particular, when it is falling in a static, homogeneous field (Nordstrém (1913), p. 554; 
cf. p. 540). The dependence of the gravitational factor on the potential in Nordstrém’s theory is 
briefly discussed—in English—by J. Mehra (19745), p. 7. 

27. Nordstrém (1913), p. 547. 

28. Nordstrém (1913), p. 550. Nordstrém adds: “Hence, it is not necessary to transport 
measuring rods and clocks from one place to another in order to compare lengths and times at 
different places.” 

29. It will be instructive to prove this in detail, in order to see where the different assumptions 
come in. We shall evaluate the g, at an arbitrary worldpoint P. Let v bea vector at P, tangent to the 
worldline of a light pulse that goes through P in the direction of increasing or decreasing x’. Let 
v! = v(x‘) be the i-th component of v relative to x. Evidently, v? = v> = 0,andv! = +v°, with each 
solution holding for some admissible choice of v. By the last postulate in proposition C (p. 139), v 
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must be null. Hence the energy E(v) = g,dx'(v"C/@x" dx (rkC/ax*) = gv! v4 = 0, Substituting 
the above values for the spatial components v*, we obtain: (v°)? (goo + 911 + 2Go1) = (8°? (Goo 
+91; —2go,) = 0. Since v° # 0, it follows that ggg = —g,, and go, = 0. Substituting first x? and 
then x3 for x’ in the characterization of the light pulse tangent to v we prove, by the same argument 
that goo = —922 = — G33, and that go. = go; = 0. Since the space coordinates x? are Cartesian, 
their parametric lines are orthogonal at every intersection and g,, = 0 whenever x # 8. 

30. gis what is now usually called a conformally flat metric, characterized by the vanishing of the 
Weyl tensor. See, for instance, Eisenhart (1950), pp. 89-92. 

31. Einstein and Fokker (1914), p. 327. 

32. Mie (1912a), p. 527; cf. pp. 513, 524. Mie (1913), p. 30. 

33. Mie (1913), p. 45. 

34. Einstein (1913a), p. 1263. 

35. Mie (1913), p. 50. Cf. note 5 to Section 5.2. The above remarks do no justice to Mie’s 
complex and subtle theory. As far as I know, the best summary in English is still that given in Weyl 
(1950), pp. 206-17. 


Section 5.4 


1. In his inaugural lecture “On the Hypotheses that lie at the Foundation of Geometry” (1854), 
Riemann showed that physical geometry cannot be derived a priori from the mere concept of 
space, for the latter is compatible with many diverse geometrical structures. If space is continuous 
its metrical properties cannot be figured out by counting points. Hence, he concluded, the 
foundation of the metrical relations in physical space must be sought in the nature of the physical 
forces that keep it together (Riemann (1867), p. 23; see below, p. 344, note 44). 

2. See Torretti (1978a), p. 73, where I quote and discuss the relevant passage of Gauss (1827), §6. 

3. The curvature of a plane curve at a point measures the rate at which the curve is changing 
direction—i.e. “departing from straightness”—as it passes through that point. The curvature’s 
sign—positive or negative—indicates the sense in which it bends. See for instance, J. J. Stoker 
(1969), pp. 21 ff; or my own short discussion in Torretti (1978), pp. 68 ff. 

4. Gauss (1827), §11. Let g denote the metric on the surface S—ie. the restriction to S of the 
Euclidean metric of space—and let u be a chart of S with coordinate functions u' and u?. Then, 
E = g(é/éu', d/éu'), F = g(6/éu',é/eu?) and G = g(@/Cu?, O/eu?). The line element on S is 
therefore given by 

ds? = E(du')* + 2Fdu'du? + G(du?)? 


Gauss showed also that the surface element on S is dS = \/(EF —G7)du' du’. If we now let H 
stand for ,/(EF —G?), and represent differentiation with respect to the i-th coordinate function 
by acomma followed by i (e.g.: E,, = GE/du'), we have that the Gaussian curvature K satisfies the 


equation 
(eta) Cee aR ee 
KH + 
2EH a 2EH 2 


Thus K depends only on E, F and Gand their first and second derivatives with respect to u! and u?. 

5. Gauss (1827), §12. 

6. See, for instance, Choquet-Bruhat et al., AM P, p. 315. By a geodetic path I mean the range of 
a geodesic. Cf. p. 94. 

7. Schur (1886) proved, moreover, that if the sectional curvatures all have the same value at 
each Pe M, they are everywhere the same, and M isa manifold of constant curvature. For a proof, 
see Kobayashi and Nomizu, FDG, vol. I, p. 202. 

8. It is known also as the Riemann curvature tensor, or the Riemann-Christoffel tensor. Its 
components relative to an arbitrary chart are given in formula (D.3) on p. 281. Unless there is 
danger of confusion, R will stand for the Riemann tensor in any of its forms, covariant, 
contravariant or mixed. 

9. 1/a = pp(v, v) Up (w, w)— (uple, w))?. 1/ Ja is the area of the parallellogram spanned by 
v and w. For a proof of equation (5.4.2), see Spivak, CIDG, vol. II, pp. 196 f. 

10. Two (0,4)-tensor fields R and S on M, which satisfy equations (5.4.1), are identical if 
R(X, Y, X, Y) = S(X, Y, X, Y), for every pair of vector fields X and Y on M. The proof is simple and 
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constitutes a good exercise for the reader. Consider the tensor field T = R — S, which also satisfies 
(5.4.1). Show first that if T(X, Y,X, Y) = 0, for arbitrary X and Y, then T(X, Y, Z, Y) = 0, for 
arbitrary X, Y and Z. From this infer that T(X, Y, Z, W) = 0 for arbitrary X, Y, Zand W. Use the 
fact that T is linear on each argument. 

11. As explained in Appendix (Sections C.6, C.8), the (1,3) Riemann tensor on a manifold 
endowed with an arbitrary Riemannian metric p is precisely the curvature tensor defined by the 
Levi-Civita connection determined by pu. Cf. p. 101. 

12. See Appendix, eqns. D.1, D.2, D.3 (p. 281). 

13. Some authors use the expression “locally Minkowskian” (or “locally Euclidean”) to indicate 
that a neighbourhood of each point of a manifold is approximately isometric to a region of 
Minkowski spacetime (resp. Euclidean space). This condition is met by every Riemannian 4- 
manifold with a Lorentz metric (resp. by every proper Riemannian 3-manifold), but the 
approximate local isometries are only as good as the local curvature will allow. The mere fact that 
the tangent space at a point has a Minkowskian (resp. a Euclidean) inner product—as it obviously 
does everywhere, by definition, on the manifolds under consideration—says nothing whatsoever 
about the value of the Riemann tensor and the manifold’s departure from flatness at that point. 

14. A manifold is path-connected only if it is connected in the topological sense, i.e. only if it 
cannot be decomposed into two mutually disjoint non-empty open sets. Two results are pertinent 
here: (A) If M is a connected n-manifold, the following statements are equivalent: (i) M admits a 
proper Riemannian metric; (ii) M is metrizable, i.e. M admits a distance function whose open balls 
constitute a base of M’s topology; (iii)the topology of M has a countable base; and (iv) M is 
paracompact, i.e. every open covering of M has a locally finite open refinement. (B) An n-manifold 
M which admits a proper Riemannian metric also admits a Lorentz metric if and only if it admits a 
non-vanishing field of line elements, ie. a smooth assignment of a pair of equal and opposite 
vectors at each Pe M. Since the latter condition is always met unless M is compact and its Euler 
number differs from 0, we may conclude that a relativistic spacetime can have almost any topology, 
provided that it is paracompact and that, if it happens to be compact, its Euler number is 0. (A)is 
proved in Kobayashi and Nomizu, FDG, vol. I, p. 271; for a proof of (B) see Markus (1955) or 
Hawking and Ellis (1973), p. 39 f. 

15. Cf. Einstein’s letter to Sommerfeld of October 29th, 1912; Einstein and Sommerfeld (1968), 
pp. 26 f. English translations of the relevant passage are found in R. McCormmach (1976), p. xxviii; 
J. Stachel (1980a), p. 13, etc. 

16. Quoted by J. Stachel (1980a), p. 12n. 

17. See his comments on “the most important point of contact between Gauss’s theory of 
surfaces and the general theory of relativity” in the Princeton Lectures of 1921 (Einstein (1956), 
pp. 61-4). 

18. Einstein and Fokker (1914), p. 328, concluding paragraph. See however Grossmann’s 
remarks in Einstein and Grossman (1913), p. 36. 

19. Einstein (1949), p. 66. 

20. The evidence has been gathered and brilliantly interpreted by J. Stachel (1980a). 

21. Born (1909c); cf. Born (1910). 

22. Quoted by Stachel (1980a), p. 2; not in Einstein/Sommerfeld (1968). On the same day, Paul 
Ehrenfest submitted to the Physikalische Zeitschrift a note on “Uniform Rotation of Rigid Bodies 
and the Theory of Relativity”, in which he presented what soon became known as the “Ehrenfest 
paradox”. Ehrenfest proposes the following criterion of “relative rigidity”, which, he claims, is 
equivalent to Born’s: a body Bis relatively rigid (relativ-starr) if the deformations it suffers in the 
course of an arbitrary motion are such that each infinitesimal element of B exhibits at each 
moment, to an observer at rest, the Lorentz contraction prescribed by the momentary velocity of 
the element's midpoint. He then derives the following “contradiction”: A relatively rigid disk of 
radius R is slowly set in rotation about its axis. Let R’ be the disk’s radius as measured by an 
observer at rest when the rotation attains the angular velocity w. Obviously, 22R’ < 22R, for each 
line element on the circumference of the disc is Lorentz-contracted as it moves in its own direction 
with momentary speed R’|w|. And yet R’ = R, for each line element along a radius of the disk has 
zero momentary speed in the longitudinal direction and consequently is not Lorentz-contracted. 
The only sensible solution to Ehrenfest’s paradox was given by Einstein ina letter to J. Petzoldt in 
1919: “A rigid circular disk at rest must break up if it is set into rotation, on account of the Lorentz 
contraction of the tangential fibres and the noncontraction of the radial ones. Similarly, a rigid 
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disk in rotation (produced by casting) must explode as a consequence of the inverse changes in 
length, if one attempts to put it at rest” (Quoted by Stachel (1980a) pp. 6f.). 

23. Einstein (1912a), p. 356. 

24, Einstein (1916a), p. 774; English translation in Kilmister (1973), p. 146. 

25. This assumption is explicitly introduced by Einstein in a footnote to the discussion of the 
rotating frame in the Princeton Lectures (Einstein (1956), p. 60). We may call it the rod hypothesis, 
by analogy with the clock hypothesis mentioned on p. 96. I must emphasize, however, that, in 
contrast with the latter, the rod hypothesis cannot be formulated exactly, for there is no such thing 
as the momentary inertial rest frame of an accelerated body: at an agreed moment its several points 
generally will be comoving with different inertial frames. That is why rods ultimately cannot hold 
their own as instruments of fundamental measurement in General Relativity. 

26. Cf. Einstein (1916a), pp. 774 f. (Kilmister (1973), pp. 146 f.); Einstein (1956), pp. 59 ff., and 
the “popular exposition”, Einstein (1954), pp. 79-82. The above discussion depends essentially on 
the possibility of counting the rods that can be placed end to end along C, and made to leave their 
imprints on C,. This possibility is evidently implied by Einstein's demand, in the final step of his 
argument in the Princeton Lectures, “that at a definite time ¢ of [F, ] we determine the ends of all 
the rods” (Einstein (1956), p. 60; my italics). If the said possibility is granted, our circumferences 
can only be measured approximately, to some agreed degree of accuracy. But one can hardly 
expect exact results from Standard Rod Geometry. The important fact is that the length of C, in 
G, will exceed 2nR by any agreed difference as R|@| approaches c. Some authors give a semblance 
of rigour to their treatment of the problem by expressing the length of the standard rod as a 
differential which they integrate over C,. The semblance is deceptive, however, and their 
arguments not a whit stricter than Einstein’s, because, as I pointed out in note 25, the concept of 
the momentary inertial rest frame of an accelerated body, involved in the evaluation of the Lorentz 
contraction of the rods placed along C,, cannot be made exact. (Cf. the otherwise highly 
commendable discussion in M@¢ller (1952), pp. 222-6, 237-44, where the inevitable inexactness 
creeps in on p. 237, line 17 from below.) I have deliberately refrained from discussing the question 
of time on the rotating frame, because it is quite complicated and would distract us from our main 
concern in this Section; cf. Arzeliés (1966), pp. 217-35. 

The rotating frame in flat spacetime is the subject of a vast and growing literature. See the 
bibliography (to 1961) in Arzeliés (1966), pp. 241-3, partially completed, to their respective dates, 
by Grdn (1975) and Griinbaum and Janis (1977). Unfortunately, much of this literature is marred 
by the persistent confusion of Einstein's analysis of the standard rod geometry of the relative space 
of a uniformly rotating frame, with Ehrenfest’s problem concerning the mechanical behaviour of a 
material disk set in rotation from rest (see note 22). Our circle C, can of course be represented by a 
uniformly rotating disk—if that helps the imagination—but we must not forget that, by 
hypothesis, the rim of this disk must always lie exactly on the circumference of C,. 

27. Einstein (1912c), p. 1063 f. 

28. Thus, for instance, ordinary two-dimensional polar coordinates map a plane from which a 
ray has been excised, non-isometrically onto the region (0, 27) x (0, x) ¢ R?. 

29. Cf. Einstein (1912c), p. 1063, quoted on p. 137. 

30. “Ich vermochte lange nicht einzusehen, was dann die Koordinaten in der Physik tiberhaupt 
bedeuten sollten” (Einstein (1934), p. 253). 

31. Einstein and Grossmann (1913), pp. 4-8; Einstein (1913a), p. 1256; (19136), pp. 285 f.; 
(19144), p. 177; (19146), pp. 1032-4, 1044-6; (1916a), § 13. 

32. The functions T'},—given by (5.4.65) in terms of the components g,; of an arbitrary 
Riemannian metric—were introduced by Christoffel (1869). For a modern approach to their 
definition see Appendix, Part C. 

33. See p. 312, note 13. 

34. In “Notes on the Origin of the General Theory of Relativity”, (Einstein, (1934), p. 254). 

35. Einstein (1914a), p. 177. The context deserves attention. After writing out eqns (5.4.3) and 
(5.4.6a)}—labelled, respectively, (2) and (1a}—Einstein comments: “This law of motion ((2), (1a)) is 
derived initially for the case of a particle that moves quite freely, on which therefore no 
gravitational field is acting (when it is judged from an appropriate reference frame). But, as 
experience shows that the law of motion of a particle in the gravitational field does not depend on 
the body’s material and as it should always be possible to state that law in the Hamiltonian form 
[6 {ds = 0], it is plausible to consider (2, 1a) generally as the law of motion of a particle on which 
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no forces other than gravitational forces are acting. This is the core of the ‘Equivalence 
Hypothesis’ ”. 

36. To suggest as some have done that the worldline of a freely falling particle must be a 
geodesic because, according to the EP, each little segment of it is mapped onto a straight in R* bya 
suitable local Lorentz chart, is like arguing that the parallels of latitude on the surface of the Earth 
are no less geodesic than the meridians, because they agree, for instance, with the avenues that 
cross Chicago from East to West, which are no less straight than those leading from North to 
South. In a Riemannian manifold every curve is “straight in the infinitesimal”. 

37. Einstein (1934), p. 219. 

38. Einstein (19125), p. 458. The paper was submitted on March 23, 1912, but the passage I am 
quoting was added in proof. It must have been printed before June 5, 1912, when Max Abraham 
submitted his (1912d), where Einstein’s (19125) is mentioned on p. 1056 n. 2. 


Section 5.5 


1. The current conception of a geometrical object as a unique entity which manifests itself 
through its several chart-dependent arrays of components did not take root until the thirties—as 
far as I know the expression “geometrical object” was first used in this sense by Schouten and van 
Kampen (1930) and was popularized by Veblen and Whitehead (1932). But Einstein had 
something like it in mind when he wrote to Michele Besso on October 31st, 1916: “Definition of 
tensors: not ‘things that transform thus and thus’. But: things that can be described, with respect 
to an (arbitrary) coordinate system by an array of quantities (A,,) obeying a definite 
transformation law.” Einstein/Besso (1972), p. 85. Cf. Einstein (1916a). p. 782. 

2. Einstein (1916a), p. 776; Kilmister (1973), pp. 147 f. Dominic Edelen (1962), p. 36, maintains 
that the requirement of general covariance has to be met only by equations governing observable 
physical quantities. However, it is questionable that a statement regarding unobservable quantities 
referred to a particular chart or family of charts of unknown physical meaning can have a definite 
sense. 

3. One can also have the following surprise: the point transformation x~'-y maps timelike 
onto spacelike curves! 

4. See, for instance, R. H. Dicke (1962), pp. 15 f. The great Soviet physicist V. Fock (1957, 1959) 
was, to my knowledge, the first one to emphasize that Einstein’s General Relativity is not really 
relativistic in the more obvious sense of the word. But Einstein himself was well aware of it too; cf. 
his criticism of Emden’s understanding of the theory in a draft letter to Sommerfeld of January 
1926 (Einstein/Sommerfeld (1968), pp. 102 f.). 

5. Einstein (1913a), p. 1257. 

6. The coordinate transformations associated with the maximal atlas of a differentiable 
manifold constitute what Veblen and Whitehead (1932), pp. 37f., called a pseudo-group. 

7. For a lucid, penetrating discussion of “Invariance, Covariance and the Equivalence of 
frames”, see J. Earman (1974). 

8. F. Kottler (1912), p. 1689, n. 1. Kottler’s work, which is mentioned by Einstein and 
Grossmann (1913), p. 23, was an attempt to formulate spacetime physics in the language of “a 
generalized vector analysis [. . .] for arbitrarily many dimensions, arbitrary metric and arbitrary 
coordinates” (Kottler (1912), p. 1669). 

9. See pp. 101 and 274f. 

10. See p. 275. 

11. Einstein and Fokker (1914), p. 322. Cf. Einstein (19130), p. 287: “The method of the absolute 
differential calculus enables us to generalize the equation systems of any physical process, as they 
are found in the original Theory of Relativity, so that they fit into the scheme of the new theory. 
The components g,, of the gravitational field always occur in such [generalized] equations. 
Physically this means that the equations inform us about the influence of the gravitational field on 
the processes of the region under study. The simplest example of this kind is the law of motion ofa 
particle, given above.” (Einstein refers to our equations (5.4.3) and (5.4.3*).) See also Einstein 
(1914a), p. 177, §5. 

12. If fis a smooth mapping of an n-manifold M into an m-manifold M’, the “pull-back” f* 
sends (0, r)-tensor fields on M’ to (0,r)-tensor fields on M. If w is a (0,r)}-tensor field on M’ 
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and X',..., X* are vector fields on M, then, at each PEM, (f*m)p(X)b,... Xp) 
= Os plhpXp-- + fypX'p}—where f, is the differential of f, defined on p. 260. If w is the 
Euclidean metric on the plane and/ is a Mercator projection of the Earth, /*w is the metric defined 
by assigning to each pair of points on the region of the Earth covered by the projection the distance 
between their images on the Mercator map. With this metric, Canada is about twice as large as 
Brazil, and the shortest route from Moscow to Washington goes through Berlin. 

13. Compare two maps of North America, made by stereographic projection on the same scale, 
but centred respectively in Denver and Detroit. The distance between the images of Toronto and 
Montréal in each map will be found to be quite different. 

14, Just how small a neighbourhood must be chosen for a preset agreement—or how good an 
agreement can be guaranteed on a pregiven neighbourhood—depends, as we know, on the 
behaviour of the Riemann tensor near P. By using the diffeomorphism f defined on p. 144 
instead of any arbitrary diffemorphism of Up into Wp, these relations are optimized and made 
definite. 

15. Consider a vector field Z on the spacetime (#’, g). Let R be the covariant Riemann tensor. If 
X and Y are vector fields, we write [X, Y] for the Lie derivative of Y along X (p. 262). We shall 
denote by R(X, Y)Z the vector field defined by the equation: R(W, Z, X, Y) = g(W, R(X, Y)Z), 
which must hold for every vector field W, if X, Y and Z are fixed. It can be shown that 


R(X, Y)Z = VxVyZ — Vy VxZ — Vx, vJZ (*) 


(See, for instance, Kobayashi and Nomizu, F DG, vol. I, p. 133, Th. 5.1; J. V. Narlikar (1979), p. 46, 
Lemma 2.) Since [¢/¢x/, @/@x*] = 0 for any pair of elements 0/€x/, ¢/Cx* of the holonomic tetrad 
field determined by achart x, V,V,Z = V,V,Zifand only if R(@/éx/4, 6/@x")Z = 0. Such will be the 
case if and only if the Riemann tensor is identically 0, i.e. if and only if the metric g is flat. 
16. “Das allgemeinste Gesetz, welches die theoretische Physik kennt”. Einstein (19135), P 287, 
17, To obtain (5.5.4b) develop T'., according to (5.5.1). Recall that Ij y= (l// =) 
(¢ J-@ g/Cx"), and substitute. ite. (Cf. Synge and Schild, TC, eqn. 2.542.) Multiply both sides of the 


resulting equation by g;, ,/ —g and sum over repeated indices. 

18. Einstein (1914), p. 1053, remarks that the frequent occurrence in his formulae of tensor 
components multiplied by ,/|g| justifies the introduction of a special notation. The product of a 
tensor component Af by the square root of the absolute value of the determinant of the metric 
components gj; is to be denoted by the corresponding Gothic letter, followed by the same indices, 
thus: Uf. The assignment of an array of such components (2g) to each chart he proposes to call 
volume tensors or V-tensors. He observes that, since ,/|g|d*x = \/|g|dx°dx' dx? dx? “is a 
scalar”, an array of V-tensor components multiplied by d*x yields the array of components ofa 
tensor field i in the usual sense. Einstein's V-tensors, known in the literature as “tensor densities” or 
“relative tensors of weight 1”, have subsequently been defined as genuine, though somewhat far- 
fetched, geometrical objects. (See, for instance, M. Spivak, CIDG, vol. I, pp. 183 ff.) We cannot go 
further into this matter. Let me recall however that if x and y are two charts with the same domain, 
if J stands for the determinant of the Jacobian matrix (@x'/@y/) and if unprimed and primed 
components are referred, respectively, to x and y, then, given that gj; = (@x"/Ay") (Ax*/y/)gyy, the 
determinant g’ equals J’. COnRAUEHNY: if Wis a (q, r)-tensor density formed, as indicated above, 
from an ordinary tensor field A,and U's = Vial A; ‘sis one of its components relative to x, 
the homonymous component of WU relative to y is 


pipes. bigest hy... hg dy! Ayle Gxtt — Oxt 
wu Jive i= = via A bce bo eed Ox Gxt Gylt” Byhe” 
On tensor densities, see B. F. Schutz (1980), pp. 128 f.; D. Lovelock and H. Rund (1975), 
pp. 102-17. C. Misner et al. (1973), p. 501, aptly comment on the rationale for resorting to such 
objects. 

19. Einstein (19135), p. 288; (1914a), p. 178. 

20. Einstein (1915a), p. 782. This is especially clear in Einstein (1914), where, by the way, he 
denotes the Hj, by in. In his (1915a), p. 783, Einstein observes that, in view of the geodesic law 
(5.4.7*}—above, p. 151—the gravitational field components are more adequately represented by 
the functions defined by (D.1), p. 281, which he thereafter calls I'/,, as we do to this day. Indeed, 
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as he points out, this view is confirmed by (5.5.5) for, as a short calculation will show, the right- 
hand side is equal to ['j, 1 /. (Readers of the original papers should bear in mind that in 1915 and 
1916 Einstein defines 'j, as minus the right-hand side of eqn. (D.1). The now standard sign 
convention is followed in the 1921 Princeton lectures.) 

21. Einstein and Grossmann (1913), p. 11, eqn. (11). Einstein uses Greek letters for the 
contravariant forms of G and T and for the indices ranging over the first four natural numbers. 

22. Einstein (loc. cit., pp. 11 f.) only states conditions (i) and (iv). He probably thought that (ii) 
and (iii) were too obvious to mention. Note that (iii) does not entail that the GG", must vanish 
identically, tor ecery choice of the g;;. 

23. Einstein and Grossmann (1913), p. 35. 

24. The symmetries of the Riemann tensor, (5.4.14), entail that R*,,; = g™Rigij = — 2” Rani 
= ~—R',,, = Oand R‘,, = —R*,, = —Rj,. Cf. Einstein (1915a), p. 781. 

25. See, for instance, J. L. Anderson (1967), pp. 360-2, or C. Misner et al. (1973), pp. 412- 16. 

26. The components of the Riemann tensor satisfy the Second Bianchi Identity: Rayij:4 + Ravjesi 
+ Rasa; = 0 (p. 280). Multiplying both sides by g*/g" and summing over repeated indices, we 
obtain, using (5.4. 1a): 


o Rein — 9 Rows g Raw: =0 (a) 
If we suitably relabel the dummy indices and recall that g'/., = 0 we see that (a) implies that: 
(—2R/,+6*,R',).; =0 (b) 
Multiplying both sides by —(1/2)g"* and summing over k we finally verify that 
(R* —1g"Ri)., = 0 {c) 


27. Einstein (1917), (1931). See below, pp. 6.18 f. Observe that the factor of g’/ in (5.5.10) is the 
sum of the diagonal components or trace of the Ricci tensor (in its mixed form): R™g,, = R*,. This 
is plainly invariant under arbitrary coordinate transformations and hence belongs to ¥ (¥# ). It is 
called the curvature scalar of (# , g). 

28. In the 1921 Princeton lectures Einstein motivated the field equations (5.5.11) by invoking a 
similar mathematical result, which however, from a physical point of view, is far less compelling 
than Lovelock’s theorem. The modified Einstein tensor is the only tensor field on a Riemannian n- 
manifold which satisfies the following four conditions: (a) it is symmetric; (b) its covariant 
divergence vanishes identically; (c) its components involve only the g,, and their first and second 
derivatives; (d) its components are linear in the second derivatives of the g,;. This theorem was 
proved rigorously by Cartan (1922). In the Appendix to the fourth edition of Raum-Zeit-Materie, 
H. Weyl had stated it without proof, while noting that it could be easily demonstrated by the same 
method employed there for proving a related theorem of H. Vermeil (1917), concerning the 
curvature scalar (See Weyl (1921a), p. 288). In the physically relevant case, in which n = 4, the 
Cartan-Weyl theorem is substantially weaker than Lovelock’s, which does not presuppose 
conditions (a) and (d). 

29. Einstein and Grossman (1913), pp. 36 and 11. John Stachel (19805) explains Grossmann’s 
opinion on the Ricci tensor as follows: According to Einstein’s theory of the static gravitational 
field—published in his (1912a,b) and reasserted in Einstein and Grossmann (1913), § 1—the line 
element on sucha field is given in terms of a suitable chart by ds? = c? (dx°)? —dx7dx*, where c? is 
a function of the space coordinates x*. To evaluate the components of the Ricci tensor we may, if 
the field is very weak, set c? = 1 — 2, fora Newtonian potential ®, and neglect all but the linear 
terms in ® and its derivatives. It turns out that Rog = V?®, and R,y ® (7@/0x77x*, where V? 
stands for the Laplacian. “While at the first sight this looks promising, since Rgg does reduce to the 
Laplacian of goo (without any coordinate conditions, as it would for any static metric), a moment’s 
thought shows that the result is really a disaster for any attempt to base gravitational field 
equations on the Ricci tensor. For if we set the Ricci tensor equal to zero (as we must outside the 
sources of the gravitational field), it follows from R,, = 0 that go can depend at most linearly on 
the coordinates. Thus, it cannot represent the exterior gravitational potential of any (finite) 
distribution of matter, static or otherwise” (Stachel (19805), p. 8). Einstein and Fokker (1914), p. 
328n., emphatically dismiss Grossmann’s rejection of the Riemann tensor and its derivatives as 
viable ingredients of the left-hand side of (5.5.7). Unfortunately they do not explain why they 
thought that Grossman’s grounds—whichever they were—were worthless. 
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30. The basic calculations will be found in Einstein and Grossmann (1913), pp. 37 f. and 15 ff. 
Equations (5.5.12a) were first given in this form in Einstein (1913a), p. 1258, and in a note 
appended to Einstein and Grossmann (1913*), p. 261. Noting that g,g", = —g'gy.,, we obtain 
the streamlined form of (5.5.12a) introduced in Einstein (19145), p. 1077: 


(/-9 9" Hie), = — (&/2) (Th + ti) 
On the same page, the equations for the tj are put in the form (5.5.125) for the first time. (Observe, 


however, that the gravitational constant denoted by x in the said passage of E.’s (19145) is only half 
as large as in the earlier papers and in our formulae.) 


Section 5.6 


1. Section 5.6 was rewritten after two long conversations with John Stachel. In its present form 
it owes much to Stachel’s thought-provoking views and illuminating remarks. 

2. Einstein (1913a), p. 1257. Here and elsewhere I substitute Latin indices (i, j,. . .) for the 
Greek indices (p, v,...) used by Einstein, whenever the latter range over the first four natural 
numbers. 

3. Einstein and Grossmann (1913), p. 18. 1 propose that we read “erstere nur” instead of “nur 
erstere” in line 5 from below. 

4. Einstein to Lorentz, 14th August [1913]; Princeton Einstein Archive, Microfilm Reel 16. 
Italicized words are underlined in the original. The last sentence reads in German as follows: 
“Wenn also nicht alle Gleichungssysteme der Theorie, also auch Gleichungen (18) ausser den 
linearen noch andere Transformationen zulassen, so widerlegt die Theorie ihren eigenen 
Ausgangspunkt; sie steht dann in der Luft.” Equations (18) of Einstein and Grossmann (1913) are 
equivalent to the field equations (5.5.12a) given above. 

5. Equations (19) of Einstein and Grossmann (1913) are equivalent to our (5.5.6). 

6. Einstein to Lorentz, 16th August [1913], p. 2; Princeton Einstein Archive, Microfilm Reel 16. 
Thanks to Enrico Fermi we can now give a definitive reason why the exchange of energy and 
momentum between matter and the gravitational field cannot be expressed by a generally 
covariant system of differential equations. If y is an arbitrary worldline of the spacetime (W, g) 
there is always a chart x defined on a neighbourhood of y such that the “gravitational field 
strengths” I"), derived according to (D.1) from the components of g relative to x will all vanish at 
every point of y (Fermi (1922); cf. Levi-Civita (1927), §4; O’Raifertaigh (1958)). Therefore, if y is the 
worldline of a particle, one can always find a chart relative to which the field exerts no force and 
does no work on the particle. The status of gravitational energy in General Relativity was criticized 
for this reason by E. Schrodinger (1918a) and H. Bauer (1918). Einstein (1918d) boldly replied that 
the energy of the gravitational field need not be localized pointwise (see especially loc. cit., pp. 451 
f.) This raises of course a difficult problem with regard to the nature of gravitational radiation, 
which cannot be represented simply by a sort of analogue of the Poynting vector. See F. A. E. 
Pirani (1957). On the subject of energy and energy conservation in General Relativity see also A. 
Trautman (1962). 

7. Einstein to Lorentz, 16th August [1913], p. 3. 

8. Einstein (1913a), pp. 1257 f., immediately after the passage quoted at the beginning of this 
Section. The same idea turns up in an undated letter to Ernst Mach with New Year's greetings 
which, according to Vizgin and Smorodinskil (1979), p. 499, was “evidently written on the eve” of 
January 1, 1913 and hence “contains the first known outline of the geometrical theory of 
gravitation”. The exact dating of Einstein’s discovery of the said idea in the letter to Lorentz of 
August 16, 1913, proves in my opinion that the letter to Mach must have been written one year 
later, circa January 1, 1914. The letter to Mach was first published by F. Herneck (1966). It is given 
in English translation by Vizgin and Smoridinskii, loc. cit. 

9. Einstein (1913a), p. 1257, column 2, note 2. A similar announcement is made in the main text 
of Einstein (1913), p. 289. This reproduces a lecture delivered to the Swiss Society of Natural 
Scientists on September 9, 1913, two weeks before the Vienna lecture. However, we do not know 
whether the remark in question was not added later. 

10. Princeton Einstein Archive, Microfilm Reel 13. 
11. Einstein and Grossmann (1913*), p. 260. The “Outline” circulated first as a booklet 
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published by Teubner in Leipzig, without these additional notes. A copy of it was mailed to Ernst 
Mach before June 25th, 1913 (see Einstein’s letter to Mach of that date in Honl (1966), p. 27). 

12. As I noted on p. 100, this view only makes sense with regard to global charts (on spacetimes 
that admit them). However, the charts considered in our paraphrase of Einstein’s argument are 
global on the submanifold which is their common domain. 

13. Consider the following elementary example. Let f,:R?—R be given by f;(a,b) 
=a’?+b, while f,:R? +R is given by f, (a,b) =f, (a,5)+J, (b, a). f, plainly depends on f,. 
However, 2 = f,(1, 0) # f.(—1, 0) = 0, although f, (1, 0) = f,(—1, 0) = 1. 

14. I have found the objection in B. Hoffmann (1972), p. 161. 

15. Einstein (19146), p. 1067. In Einstein (1914a), p. 178, the change in the argument is 
introduced in a footnote purporting to clarify the meaning of the main text, which essentially 
repeats the original version in Einstein and Grossmann (1913*), p. 260. This, and the fact that the 
original version reappears almost unaltered in Einstein and Grossmann (1914), p. 217, which was 
finished after Einstein (1914a) had been submitted on January 24th, 1914, may warrant J. Stachel’s 
suggestion that Einstein always understood his argument in the same sense, as dealing with the 
geometric objects themselves, not just with their chart-dependent components. 

16. Einstein (19145), p. 1067. 

17. The second interpretation seems to me more likely because Einstein says that G’(x) is 
referred to the coordinate system K. 

18. Since the charts y and z agree everywhere except on the hole H whose boundary lies in the 
interior of their common domain, they must have the same range. Hence the point transformation 
f=y ‘+z is defined on the entire domain of z, which it maps onto itself. 

19. See above, p. 316, note 12. 

20. Write fP for f(P). The typical element of G(x) is gy p(4/éz' ‘Tepe O/€24| pp) 

= (6x S162" |p) (ey f/02" | p) By p(C/C "Ip pe C/EN"| 5 p) = By pa pC/CZ| phe Su plC/C2" |p) 


= (f* ghp(E/Cz'| p, 0/2 | p). 

21. It follows that, for any set of vector fields X,,...,X, on M, w(X,,...X,)= 
OA TAX yy. 6 (h-*-h),X,) = oly (A,X). ty A,X) = (AO toh X 1. yX,) 
=h,w(h,X,,... h,X,). This shows in what sense w is ‘dragged along’ by h,. 


22. We prove it directly by a simple computation: y, (f,g) = (y')*(f"')*g = (0 1-37" )*g 
= (Qs) tg = (27 ')*g = 2,8. 

23. Einstein (1915c), p. 832. 

24, Princeton Einstein Archive, Microfilm Reel 9. 

25. Einstein/Besso (1972), pp. 636. 

26. According to the equation of geodesic deviation. See p. 343, note 38. 

27. Note that fis not an isometry of (U, g) onto itself; for g # f*g. It is, however, an isometric 
mapping of (U, g) onto the distinct Riemannian manifold (U, f,g}—same underlying manifold, 
but different metric!—because, by our definition, if X and Y are any two vector fields on U, g(X, Y) 
= fe XS) 

28. The relevant passage of the letter to Ehrenfest had been reproduced in English translation 
by Earman and Glymour (1978a), p. 273, but with a mistranslation that mars its sense. 

29. Newton, “De gravitatione et aequipondio fluidorum”, in Hall and Hall (1978), p. 103. See p. 
285, note 7. 

30. The viability of such a conception of spacetime is cogently argued for by Mortensen and 
Nerlich (1978). 

31. See, for instance, the anthology by S. P. Schwartz (1977). 

32. This is not to deny that two things can be really different even if they are empirically 
indistinguishable and they cannot, on any grounds, be regarded as instances of distinct conceptual 
structures. Differences may well exist which our concepts and observations cannot grasp. But such 
differences—which anyway would lie outside the purview of physical science—are irrelevant to the 
present discussion. Here we are concerned with a hypothetical case proposed by ourselves, and the 
physical situations considered are in no sense more different from one another than, in the light of 
our assumptions, we can make them out to be. 

33. Einstein’s “hole” argument, as presented in his (19145), p. 1067, was criticized also by David 
Hilbert, in his second paper on “The Foundations of Physics”, submitted to the Géttingen 
Gesellschaft der Wissenschaften on December 23, 1916. According to Hilbert, Einstein's argument 
involved a misunderstanding of the physical Principle of Determinism (Kausalitdtsprinzip) as it 
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applies to generally covariant theories. A generally covariant differential system with n unknowns 
cannot comprise more than n — 4 independent equations and must therefore be underdetermined, 
because otherwise the arbitrary choice of a chart, subject to 4 coordinate conditions, would 
generate an overdetermined and hence generally inconsistent system (Hilbert (1917), pp. 59-61 
and note | on p. 61). Thus, for example, the 10 field equations of General Relativity (5.5.10) are 
constrained by the 4 identities G4 = 0, where G is the proper Einstein tensor. Hence no more 
than 6 equations are independent and the system does not suffice to determine the 10 unknown 
functions g; . 


Section 5.7 


1. Einstein/Besso (1972), p. 53. Einstein adds: “Hence it turns out that there are transform- 
ations representing accelerations of a very varied nature (e.g. rotations) which map the equations 
on themselves (e.g. rotations also).” 

2. Einstein and Grossmann (1914), p. 225: “Die Kovarianz der Gleichungen [ist] eine denkbar 
weitgehende”. Einstein was aware, of course, that any physical theory could be given a generally 
covariant formulation—-he did not have to wait for E. Kretschmann (1917) to enlighten him on 
this point. However, the hole argument (Section 5.6) had persuaded him that a generally covariant 
substitute for the Einstein-Grossmann equations (5.5.12) would “not be particularly interesting 
froma physical or a logical point of view” (Einstein (19144), §9, p. 179). See also Einstein (19134), 
p. 1258, column 1, lines 37-9. 

3. Einstein and Grossmann (1914), p. 219, n. 2. 

4. Einstein to Sommerfeld, 15th July 1915 (Einstein/Sommerfeld (1968), p. 30). Einstein’s visit 
to Gottingen probably stimulated Hilbert to work on a theory of gravity based on Einstein's 
programme. See p. 175. 

5. Einstein, (1915a, b, c,d), submitted on November 4th, 11th, 18th and 25th, respectively. The 
alternative systems of field equations are proposed in the first, second and fourth paper. The third 
paper derives the precession of Mercury’s perihelion from the second system. 

6. Einstein/Sommerfeld (1968), pp. 32 f. As I noted in the Introduction, this section contains a 
few passages which presuppose some acquaintance with the calculus of variations and its 
application to physical field theories. Readers lacking it may skip over them or turn to Lovelock 
and Rund (1975), Chapters 6 and 8, which furnish a very clear introduction to the subject, 
especially attuned to the needs of the student of Relativity. 

7, Einstein (1915a), p. 778. 

8. Einstein (1915a), pp. 778 f. 

9. Observe, by the way, that if spacetime admits one J-atlas it willadmit infinitely many distinct 
J-atlases, obtained from the first by changing the scale of the charts. Such atlases belong to different 
maximal J-atlases, for the Jacobians of the rescalings are not equal to 1. 

10. Einstein (1915a), p. 781. See above, p. 318, note 24. 

11. I modify Einstein’s notation. In the 1915 papers he denotes the components of the Ricci 
tensor by G;, and uses R;, for our U;;. The last equality in (5.7.3c) follows from a remark on p. 317, 


note 17, which entails that Tj, = (log / —g),. 

12. A proof is given in Einstein (1916a), §15 (where L is denoted by H). 

13. Remember that in Einstein's printed version of (5.5.12b) our Hi, were denoted by I'j,. The 
new formula for the tj is therefore typographically almost identical with the old. Note however the 
changed position of j and k in the last term. The factor ./ —g can be divided out because, as a J- 
scalar, it is not affected by the allowed transformations. It is therefore otiose to distinguish 
between J-tensor fields and J-tensor densities. 

14, Bear in mind that g'{ = —(g"T,+9"T%,), as a simple calculation can show. 

15. According to our remark in note 14, g"T¥, = (1/2) (gi; —g*T i}, —g™ ,). Substitute this 
value in the first term of the left-hand side of (5.7.10) and differentiate both sides with respect to x". 
The right-hand side vanishes by (5.7.7). 

16. Use the remark in note 14. It entails that g ,g,,; = —214, = —2(log \/—9)x- 

17. Einstein (1915a), p. 785. Einstein adds that the condition (g'/,;—xt',) =0, on which 
(5.7.13) depends, “says how the coordinate system must be adapted to the manifold”. I take this to 
mean that the said condition must be regarded as a convention restricting the choice of charts— 
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like eqns (5.7.2) in the Einstein-Grossmann theory. It is worth observing, however, that if 
(«tt —g'/ ,;) is equated with any arbitrary constant C, the same calculation that led to (5.7.13) will 
show that, if g is constant, xT/, = C, a rather unpalatable result. Cf. note 21. 

18. Cf. eqns (4.5.12) on p. 117. Substitute g‘/ for n‘/, multiply both sides by g,; and sum over i 
and j. 

19. Einstein (19155), pp. 799 f. On the “electromagnetic view of matter” see p. 290, note 18. 
Gustav Mie’s Theory of Matter was initially predicated on the tentative assumption that “the 
electric and magnetic fields, the electric charge and the charge current are entirely sufficient for 
describing all phenomena in the material world” (Mie (1912a), p. 513). These objects were 
supplemented, in the Theory’s third instalment, with quantities representing the gravitational 
field. Cf. Mie (1913), pp. 27 f. 

20. Einstein (19155), p. 800. 

21. Einstein (19155), p. 800. Why is the adoption of (5.7.15) “made possible” by the hypothesis 
that T} = 0? A straightforward answer to this question can be found using the Second Bianchi 
Identities, as on p. 318, note 26. Equation (a) in that note entails that 


Rh. = 2Ri;; (d) 
substituting —xTj for Rj, in agreement with (5.7.15), and dividing through by — x, we find that 
Thea = 2Tj,; (e) 


According to the fundamental law (5.7.18), the right-hand side of (e) must vanish. Hence, 0 = T}., 
= Tj, (for T} is a scalar field), and Tf = const. By making a virtue out of this necessity, the 
hypothesis in Einstein (19155) paves the way for eqns (5.7.15). It is unlikely however that an 
argument along these lines deterred Einstein from adopting the generally covariant field equations 
(5.7.15) in the first paper of November 1915. The Second Bianchi Identities, though published by 
Bianchi in an Italian journal in 1897, are notoriously absent in the German literature of the 1910s. 
See J. Mehra (1974), p. 49 and note 242. 

22, Anexact solution of the same problem was soon found, independently, by K. Schwarzschild 
(1916—submitted January 13th) and J. Droste (1916—submitted May 27th). 

23. The results of the British Expedition, solemnly announced by the Astronomer Royal on 
November 6th, 1919, were hailed by the world press as sufficient proof of Einstein’s theory of 
gravity. However, their level of accuracy was well below the usual standards of astronomy and 
optics. (Earman and Glymour (1980a) have moreover accused Eddington of doctoring the data.) 
The bending of light-rays in a gravitational field predicted by General Relativity has been verified 
with reasonable accuracy only in the last decade, by the method of very-long-base-line radio 
interferometry. Let 1.75” (1 + 7)/2 be the observed value of the deviation. Fomalont and Sramek 
(1976) obtained 7 = 1.01 + 0.02. See the brief discussion in I. I. Shapiro (1980a), pp. 474-7. 

24. Einstein (1915c), The observed value of Mercury's secular perihelion advance amounts to 
some 5600”. Almost 90°, is accounted for by the Earth’s own motion. Over 9%, is attributable, 
according to Newton's Law, to the perturbing action of the other planets. But there remains a 
balance, estimated by G. M. Clemence (1947) as 43.11”"+0.45” per century, which classical 
astronomy was unable to explain. The beautiful agreement of this figure with Einstein's ts 
unfortunately somewhat messed up by uncertainties regarding the shape of the Sun. See R. H. 
Dicke and H. M. Goldenberg (1974), A. R. Hill and R. T. Stebbins (1975), C. M. Will (1979), 
pp. 55-57. I. I. Shapiro (1980a), pp. 481-3. 

25. Multiply both sides of (5.7.17) by g‘/, sum over i and relabe! dummy indices to obtain 
Rf = xT}. Substituting this value for xT} in (5.7.17) yields eqns (5.5.10) in covariant form. 

26. Einstein (1915d) p. 846. The sneer quotes are Einstein's. 

27. Compare eqns (5.7.3), (5.7.6*). 

28. In the following calculations use the hints given in notes 14, 15, 16. 

29. Both (5.7.10) and (5.7.21) formally correspond to the Einstein-Grossmann field equations. 
Compare them with the formulation of the latter given on p. 319, note 30. 

30. Einstein (195Id), pp. 846f. I take it that in this passage Einstein states a reason for 
preferring the field equations (5.7.17) to eqns. (5.7.15}. Thus, I cannot agree with J. Earman and 
C. Glymour (19785), p. 305, for whom “the only explicit reason” that Einstein gives here for 
changing the field equations is that Tt and t} occur symmetrically in (5.7.22) and (5.7.23). 
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31. Einstein (1916a), p. 803 f.; Kilmister (1973), p. 156. 
32. Einstein (1916a), p. 807; Kilmister (1973), p. 160. 
33. Einstein (1916a), §§17, 18. = 


34. For a detailed proof see Lovelock and Rund (1975), pp. 305-323. ,/ —g Rf is not the only 
scalar density that can be constructed from the components of the metric tensor and their first and 
second derivatives, but it is certainly the simplest. The following facts contribute further to single 
out the Einstein field equations among all conceivable alternatives: (i) There does not exist a scalar 
density which depends solely on the g,, and their first derivatives. (ii) If L is a scalar density 
depending only on the g,; and their first and second derivatives and if the corresponding Euler- 
Lagrange equations E'/(L) = Oare of the second order—and not of the fourth order, as one would 
normally expect—then E‘/(L) equals R'/ —(1/2)g'/ Rk+ Ag'4, ie. the modified Einstein tensor 
(Lovelock (1969); Rund and Lovelock (1972), pp. 43 f.). 

35. See, for instance, A. Papapetrou (1974), pp. 122 f. 

36. Hilbert (1915). The so-called Einstein tensor, with components R, , — (1/2), RX, first came 
out in print on p. 404, line | from below. Hilbert’s paper is analysed by J. Mehra (1974), pp. 24-31. 
On Hilbert’s relations with Einstein in 1915-16, see J. Earman and C. Glymour (1978)). 


Section 5.8 


1. For a detailed up-to-date discussion of the subject with abundant references to the 
enormous literature, see J. Ehlers, ed. (1979), especially the contributions by P. Havas (pp. 74 ff.) 
and W. G. Dixon (pp. 156 ff.). | thank John Stachel for correcting several errors in the first draft of 
this Section and for calling my attention to Havas's work. 

2. I follow Einstein’s view of the matter. See, for example, Einstein and Grommer (1927), 
pp. 2f.; Einstein and Infeld (1949), p. 210. But the above analysis of the relation between field 
equations and equations of motion in classical electrodynamics ought to be qualified in some 
respects. As we saw on p. 113, the Lorentz force law can be inferred from the Relativity Principle 
and Maxwell's equations. Moreover, as is well known, the equation of continuity (charge 
conservation) provides conditions of integrability of the Maxwell equations, whence not any 
arbitrary charge distribution but only one which satisfies the equation of continuity is compatible 
with them. 

3. According to J. Stachel (1969a, 1972) no such constraints need arise in the case of a 
gravitational field with an external source. In that case one describes the source by a given energy 
tensor T and seeks to integrate the inhomogeneous Einstein equations (5.7.17) or (5.5.10). Since 
these equations entail that the contravariant energy tensor components T‘/ relative to any chart 
must satisfy the conservation equations 


TY, =0 (5.5.4a) 


and the covariant divergence in (5.5.4a) involves the very metric components that are the 
unknowns for which one is trying to solve the field equations, it seems at first sight that the energy 
tensor cannot be arbitrarily assigned. However, a theorem proved by C. Pereira (in his 
unpublished doctoral thesis of 1967) apparently removes all restrictions on the choice of the 
energy tensor 7. According to Pereira's theorem, if (T“/) is a 4 x 4 symmetric matrix of scalar fields 
ona spacetime region U and if (g,;) are the components of an arbitrary Lorentz metric g relative to 
achart x defined on U, one canalways find on a neighbourhood of each point Pe U achart ” such 
that, if we treat the T'/ as the components relative to y? ofa (2, 0) tensor density J, it turns out that 
the covariant divergence of T vanishes. (The covariant divergence must of course be computed 
using the transformed metric components (Cx"/Cy') (Cx*/CW) g),-) Thus any arbitrary energy tensor 
could apparently be reconciled with the need of satisfying eqns. (5.5.4a). The reader will not fail to 
observe that, while the scalar fields T'/ are completely arbitrary, their exact physical interpretation 
as energy tensor components depends on the structure of the charts y" and will probably be quite 
different in their several domains. In particular, there is no guarantee that any particle worldlines 
defined by those interpretations will be geodesic or will have any other preestablished shape. Not 
to mention the fact, appositely stressed by Stachel (in a conversation with the author), that such an 
arbitrary array of scalar fields interpreted as energy tensor components relative to prospective 
charts will usually turn out to be physically inane. 
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4. Einstein and Grossmann (1913), p. 10. 

5. W. Rindler (1977), pp. 182 f. See also A. S. Eddington (1924), pp. 126 f. 

6. H. Weyl (1923), §38. See also § 36 of the fourth edition on which the English translation is 
based. 

7. A. Einstein and J. Grommer (1927); see also A. Einstein (1927). 

8. A. Einstein and J. Grommer (1927), p. 4. See the elegant reconstruction of Einstein and 
Grommer’s basic argument in J. Ehlers (19805). 

9. A. Einstein and N. Rosen (1935), p. 73. 

10. Singularities in General Relativity are the subject of Section 6.4. As we shall see there, 
several proposals have been made for supplementing relativistic spacetime with a set of ideal 
points comprising the so-called singularities. However, in my opinion, such ideal points cannot be 
regarded as possible locations for events (e.g. events in the history ofa particle) without drastically 
modifying the theory. In order to view them as such one would have indeed to equate the physical 
world with the Schmidt completion of relativistic spacetime (p. 217) or some other no less abstruse 
mathematical contraption, as step which, as far as I know, nobody has yet ventured to take and 
which appears to be quite unwarranted. 

11. Cf. Einstein and Infeld (1949), p. 209: “All attempts to represent matter by an energy- 
momentum tensor are unsatisfactory and we wish to free our theory from any particular choice of 
such a tensor. Therefore we shall deal here only with gravitational equations in empty space, and 
matter will be represented by singularities of the gravitational field.” Einstein, Infeld and 
Hoffmann (1938), p. 65, reject the assumption of a specific energy tensor for matter because such 
tensors “must be regarded as purely temporary and more or less phenomenological devices for 
representing the structure of matter, and their entry into the equations makes it impossible to 
determine how far the results obtained are independent of the particular assumption made 
concerning the constitution of matter”. Less formally, Infeld (1957), p. 398, remarks that Einstein 
always thought that to use the field equations (5.5.10) instead of the vacuum field equations in 
which T'/ = 0 “is somehow in bad taste, because we do not know in [5.5.10] what [T‘/] is, and we 
mix a geometrical tensor on the left side with a physical tensor on the right side”. 

12. L. Infeld and A. Schild (1949), p. 410. To avoid misunderstandings I have substituted Latin 
for Greek indices in agreement with our usual convention. 

13, Fock’s theory is expounded at length in Chapter VI of his (1959). Its first formulation was 
simultaneously published in Russian and French in 1939, N. Petrova (1949) carried Fock’s original 
approximation one step further. A. Papapetrou (1951a) gave a different derivation of the 
equations of motion of finite masses from the field equations, based on an approach similar to 
Fock’s. In his recent text-book A. Papapetrou (1974), pp. 152-7, discusses the motion of test 
particles from this point of view. 

14. To obtain the general relativistic form of the energy tensor of a perfect fluid substitute the 
general metric components g'/ for the Minkowski components y‘/ In equations (4.5.17*). 

15. On spinning test particles see A. Papapetrou (19515), E. Corinaldesi and A. Papapetrou 
(1951). 

16. In his encyclopedia article on General Relativity, P. G. Bergmann (19625), p. 245, comments 
on the inadequacies of both Einstein’s and Fock’s work on the subject. 

17. P. Havas (1979), p. 121, after repeating the argument of his earlier paper with J. N. Goldberg, 
emphasizes that the Geodesic Law is not “the full law of motion for an extended body” or “for 
particles which carry both monopole and dipole moments”, but only an approximation to such full 
laws. 

18. P. Havas (1979), p. 121. 

19. W. G. Dixon (1970a, b, 1974/75, 1979), J. Ehlers and E. Rudolph (1977), R. Schattner 
(1979a, b). I follow J. Ehlers (19805). 

20. J. Ehlers (1980a), p. 283. 

21. J. Ehlers (1980a), p. 283. 

22. J. Ehlers (19805). 

23. The domain of dependence D (S) of a subset S of a causal space was defined on p. 124. The 
concepts of causal geometry can be extended to arbitrary relativistic spacetimes as explained on 
page 192. Cf. also pp. 247 f., 252 f. 

24. On the Cauchy problem and the dynamical formulation of General Relativity see the review 


NOTES 325 


articles by Y. Choquet-Bruhat and J. W. York (1980), A. E. Fischer and J. E. Marsden (1979) and 
C. Teitelboim (1980). The articles by Bruhat (1962) and Arnowitt, Deser and Misner (1962) in 
Witten’s Gravitation are still recommended as an introduction to more recent research. 

25. Einstein (1913a), p. 1266. 

26. Einstein (1918a), p. 154. 

27. Einstein (19184). Cf. Einstein (1916), p. 692. 

28. H. Bondi, F. A. E. Pirani and I. Robinson (1959). 

29. Weinberg (1972), p. 251. The mathematical difficulties of the subject arise from the non- 
linearity of Einstein’s field equations. The main conceptual difficulty was mentioned on p. 319, 
note 6, where I referred the reader to Pirani’s trailblazing article of 1957. Due to the extreme 
weakness of gravitational interaction (roughly 10~*° times as strong as electromagnetism), 
gravitational radiation is very hard to detect. At the time of writing, the strongest evidence for its 
existence was that presented by J. H. Taylor et al. (1979): The binary pulsar 1913 + 16, presumably 
formed by two neutron stars orbiting about each other in 7.8 hours, appears to be losing energy 
and angular momentum at the rate expected from gravitational radiation due to the orbital motion 
of its constituents. Gravitational radiation is explained informally in a pleasant book by P. C. W. 
Davies (1980). For more rigorous presentations the reader may turn to any good graduate 
textbook of General Relativity, or to Pirani (1965). Grishchuk and Polnarev (1980) is a recent 
survey, with up-to-date references. On the current state of experimental research, see Douglass and 
Braginsky (1979). 

30. The best recent surveys on the subject are to be found in Ehlers, ed. (1979). 

31. See Geroch (1977). On pp. 216-8 we touch on the subject of ideal points at infinity, in 
connection with spacetime singularities. 

32. The conformal treatment of null infinity was introduced by Roger Penrose; see Penrose 
(1964, 1968). To define a manifold-with-boundary we consider first the n-dimensional half-space 
H" c R". This consists of all real number n-tuples with non-negative first element. The boundary 
of H" (in R") is the (n —1)-manifold {0} x R"~'. We view H" as a topological space with the 
topology induced by the standard (Euclidean) topology of R”. An n-manifold-with-boundary M is 
defined like an n-manifold (p. 257), except that the charts of M are bijective mappings of subsets of 
M onto open sets of the half-space H". A point P € M belongs to the boundary 0M c M ifand only 
if Pis mapped bya chart ona point of the boundary of H”. (The reader ought to satisfy himself that 
this property of being charted into the boundary of H” is invariant under coordinate 
transformations.) 

33. By “Schwarzschild mass” we understand primarily the integration constant that turns up in 
Schwarzschild’s exact solution to Einstein's field equations in vacuo (designated by m in the 
Schwarzschild line element, as given on p. 335, note 2). The concept can be extended to other static 
solutions. 

34. One must bear in mind, however, that because electricity can be both attractive and 
repulsive, whereas gravity, in the short range, is purely attractive, electromagnetic radiation starts 
with dipole terms, while gravitational radiation in the linearized approximation starts with 
quadrupole terms. 

35. The linearized or weak-field approximation to General Relativity is obtained by setting the 
metric components g; ; equal to n;; + y;;, with y; ; small; thus assuming that spacetime is very nearly 
flat. See Einstein (1916d), or Rindler (1977), pp. 188 ff. 

36. See the review article by W. L. Burke (1979) and the references given there. 


Chapter 6 


Section 6.1 


1. There are several survey articles on special questions, such as the brilliant 100-page 
monograph on “Singularities and Horizons” by Tipler, Clarke and Ellis (1980), and I know of two 
long histories of 20th-century cosmology, by J. N. North (1965) and J. Merleau-Ponty (1965}— 
both of them published by a curious coincidence in the same year in which the discovery of the 
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background thermal radiation by Penzias and Wilson tipped the scales in favour of relativistic 
cosmology and triggered a new spurt of interest in it. But there is as yet no historical work on 
General Relativity that can be compared, say, with Max Jammer’s books on Quantum Mechanics. 

2. Let v, w be vectors at P, Q €.M, respectively. Q = tP for some te R*. w is parallel to v if and 
only if w = t«p(v). 

3. If R* acts transitively on.# asa Lie group of translations a global chart x:.# — R’* is defined 
by setting x (¢ P) = ¢ fora fixed “origin” Pe. If x:.M — R* is a global chart, a smooth transitive 
and effective action of the additive group of R* on .@ is defined by setting @ = ¢ P whenever 
x(Q)—x(P)=t. 

4. Levi-Civita, who had coauthored with Ricci the main source for the study of the absolute 
differential calculus invented by the latter (Ricci and Levi-Civita, 1901), had kept an eye on 
Einstein’s use of it in his first theory of gravity. In the Spring of 1915 both men corresponded 
actively about what Levi-Civita saw as mathematical errors in Einstein (19145). (Princeton 
Einstein Archive, Microfilm Reel 16. On April 2, 1915 Einstein wrote to Levi-Civita: “Eine so 
interessante Korrespondenz habe ich noch nie erlebt”.) Levi-Civita’s research on parallelism was, 
he says, directly encouraged by Einstein’s theory of gravity, being guided by a desire to simplify the 
formal apparatus (“ridurre alquanto l’apparato formale”) used for the definition of the Riemann 
tensor, in view of its essential role in “the mathematical development of Einstein's grandiose 
conception” (Levi-Civita (1917), p. 173). G. Hessenberg (1917), who independently achieved a non- 
metric characterization of geodesics, curvature and covariant differentiation which is essentially 
the same as Levi-Civita’s and involves in effect a conception of vector parallelism (though the 
expression is not used), was also motivated by the wish to simplify “the vast and heavy apparatus of 
formulae” (“der umfangliche und schwerfallige Formelapparatus”) of the theory of quadratic 
differential forms, in view of its newly acquired significance for Relativity. A decade earlier, L. E. J. 
Brouwer had explicitly introduced the concept of vector parallelism along a curve in a discussion 
of Riemannian manifolds of constant negative curvature (Brouwer (1906), pp. 5, 16, 18). 

5. Levi-Civita (1977), pp. 100 ff. The first Italian edition of this book appeared in 1923. 

6. To see that the enveloping surface S’ is indeed developable consider a sequence o, of 
equidistant values of t, (ty, . . ., t,), with y (tg) = Pand y(t,) = Q. The family of planes 15, . . ., 2, is 
enveloped by a polyhedral surface S' that coincides with 7; on the region containing 7 (1;) which is 
bounded by the intersections of 2; with z;-, and with z,,,(1 <i <r). S can obviously be 
developed onto a plane by rotating its sides about its edges. Now form new sequences a), 
a3,... of values of t, by adding at each turn the midpoint between every two consecutive values in 
the preceding sequence. We thus generate an infinite sequence S,, S), S3,... of developable 
surfaces which converges to S’. 

7, An embedding f of an n-manifold M into an m-manifold M’ is a diffeomorphism of M ontoa 
submanifold of M’ (see p. 261). f is isometric with respect to the metric g of M and the metric 
g of M’ if g=/*g’. (M, g) is globally isometrically embeddable (g.i.c.) in (M’, g’) if there is an 
isometric embedding of the former into the latter. (M, g) is locally isometrically embeddable (1.i.e.) in 
(M’, g’)ifeach point of M has a neighbourhood U such that (U, g)is g.i.e. in (M’, g’). The following 
results on embeddability have been proved after the publication of Levi-Civita’s paper: (i) Any n- 
manifold can be embedded in R?"*! (Whitney, 1936) and also in R?" (Whitney, 1944a), but it may 
not be embeddable in R?"~' (Whitney, 19445). (ii) An n-manifold M with C’ -metric gis l.ie.ina 
similar m-manifold (M’, g')ifm = (n/2)(n + 3)and the number of positive and negative eigenvalues 
of g are respectively less than those of g’ (Greene, 1970); moreover, if the manifolds and their 
metrics are analytic (p. 288, note 12), m can be as low as (n/2) (n + 1), exactly as Levi-Civita (1917, 
p. 176) claimed (Friedman, 1961). (iii) Requirements of g.i.e are of course more stringent; thus, 
while an arbitrary 4-manifold with a Lorentz C’ -metric is L.i.e. in R'*, endowed with a metric of 
suitable signature, it is g.i.e., as far as we know, ina flat R®° if it is compact (Greene, 1970), and ina 
flat R°° if it is non-compact (Clarke, 1969; Clarke's result holds for a C?-metric). Further results 
and explanations on the subject, and a list of 144 references, are given by H. F. Goenner (1980a). 

8. The tangent to yat y(t)is 7,,d/dt = (dx'-y/dr) 6/éx'|,,. Substituting this expression for V in 
(6.1.1) we obtain the geodesic equations (D.8) on p. 281. 

9. Weyl (19184), p. 385; cf. Weyl (19196), p. 112. Weyl’s unified theory was first developed in his 
(1918b), submitted by Einstein to the Prussian Academy. (See also Weyl (19195), (1921c) and the 
various editions of his book Raum Zeit Materie.) Upon receiving Weyl’s manuscript, Einstein 
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wrote him that the main idea was “a stroke of genius of the first rank” (Einstein to Weyl, 6th April 
1918, Princeton Einstein Archive, Microfilm Reel 24; Einstein's emphasis). However, a few days 
later, he remarked: “So schén Ihr Gedanke ist, muss ich doch offen sagen dass es nach meiner 
Ansicht ausgeschlossen ist, dass die Theorie der Natur entspricht.” The objection mentioned in 
note 10 follows. (Einstein to Weyl, 15th April 1918, ibid.) 

10. This gave rise to the following objection to Weyl’s theory, put forward by Einstein in his 
letter of 15 April 1918 and printed at the end of Wey! (1918b): if the characteristic frequencies of 
atomic clocks depend not only on their actual spacetime locations but also on their past 
worldlines, spectral lines would be blurred. 

11. See p. 307, note 5. Weyl defines the electromagnetic field tensor as in that note and obtains 
the first set of Maxwell equations (4.5.5) as a mathematical corollary. 

12. Eddington (1921); (1924), pp. 213 ff. 

13. Componentwise, R;, = 3 (Rj; + Ry) +4 (Rij — Ryi). 

14. Einstein (1923a) proposed a system of field equations, derivable from a variational 
principle, for Eddington’s theory. Einstein (1923b,c) add further developments. The content of 
Einstein’s three papers is reported in English in Eddington (1924), Supplementary Note 14, 
pp. 257 ff. 

15. See Einstein (1930a,b), which supersede the earlier presentations in Einstein 
(1928a, b; 1929a, b). See also Cartan (1930) and Cartan/Einstein (1979), which contains the 
important correspondence that went on between them from 1929 to 1932. 

16. Cartan to Einstein, 8th May 1929, in Cartan/Einstein (1979), p. 6. Observe that the implicit 
Lorentz metric is compatible, in the sense of page 192, with the non-symmetric connection that 
governs “parallelism at a distance”, though the latter is obviously not its Levi-Civita connection. 
The field equations proposed by Einstein (1930a, p. 693) involve both the metric and the torsion 
components as the data from which the connection components are to be obtained by integration. 

17. Einstein (1956), Appendix II, pp. 133-66. 

18. Trautman (1965), p. 107, presents the matter as follows. Let I be an arbitrary symmetric 
linear connection on .M. Let the scalar field t and the symmetric (2, 0)-tensor field g be coupled to 
each other as above. Taking components with respect to a global chart x, let us postulate that: 
(i) Rin =O (where Ri,, is defined by (D.2)); (ii) Rinm (Gt/0x") = Rinm (Ot/Ox!); (iti) gi Rim 
= g'*R/y4; and (iv) g‘/, = 0. Given these postulates, Newton's theory of gravity consists of the 
vacuum field equations R,, = Oand the equations of motion (5.4.7*)—as in General Relativity, the 
Ricci tensor vanishes in vacuo and freely falling particles obey the geodesic law. See also G. 
Dautcourt (1964). 

19, The angle between two vectors v and wat a point P of a proper Riemannian manifold (M, g) 
is, by definition, equal to arc cos (gp (v, w)/(gp(v, v) gp (w, w))'/?). 

20. If visa vector at a point P and />p is an arbitrary real number, gp(v, v) is greater than (equal 
to, less than) zero if and only if the same holds for 4; Bplv, v). 

21. In the notation of note 19, the vector v is orthogonal to w if and only if gp(v, w) = 0. This 
criterion also makes sense if g is indefinite, in which case, of course, all null vectors must be 
pronounced self-orthogonal (cf. p. 92). 

22. Weyl (19216), p. 100. The theorem does imply this insofar as the free fall of particles and the 
propagation of light fix, respectively, the projective and the conformal structure of spacetime. 
However, this premiss must be taken with a grain of salt, for the timelike and null geodesics of 
General Relativity, on which indeed the said structures do depend, are not supposed to be the 
actual worldlines of real particles and light pulses, but the worldlines that such objects would 
describe under certain ideal circumstances. 

23. Ehlers et al. (1972), p. 65. The authors refer to Helmholtz (1868}—a paper I have discussed 
in Torretti (1978a), pp. 155 ff. 

24. Istrongly advice the reader to study the original paper by Ehlers, Pirani and Schild (1972) or 
the subsequent exposition by Ehlers (1973), pp. 22-37. Ehlers and Schild (1973) prove important 
results on projective structures (in Weyl’s sense). 

25. Ehlers, Pirani and Schild (1972) postulate that the radar charts must constitute a C?-atlas. 
Cf. my remarks on p. 348, note 4 to part A. 

26. Another very interesting construction of the manifold structure of spacetime along similar 
lines is that given by N. M. J. Woodhouse (1973). 
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!. See the references to Newton's water-bucket and two-globe experiments on p. 285, note 11. 

2. H. Weyl (1923), p. 221. 

3. Einstein/Born (1969), p. 258. See also Einstein to Besso, 10 August 1954, in Einstein /Besso 
(1972), p. 525. 

4. Einstein to Mie, 8th February 1918, page !; Princeton Einstein Archive, Microfilm Reel 17, 
Einstein’s “relativistic standpoint” of 1918 ultimately rests on the observationist dogma he 
professed at that time: “Das Kausalgesetz hat nur dann den Sinn einer Aussage tiber die 
Erfahrungswelt, wenn als Ursachen und Wirkungen letzten Endes nur beobachthare Tatsachen 
auftreten” (Einstein (1916a), p. 771-—in English in Kilmister (1973), p. 143; I advise reading the 
whole paragraph). 

5. Einstein to Mie, 22nd February 1918, page 1; Princeton Einstein Archive, Microfilm 
Reel 17. I have substituted Latin for Greek indices; see p. 319, n. 2. 

6. Einstein (19185), pp. 241 f. 

7. Einstein (19180), p. 241 n. Note that the German designation Machsches Prinzip means 
Machian Principle and does not entail that the principle was authored or even subscribed to by 
Mach himself. 

8. His stance changed in the course of the following decade. In a lecture delivered in 1927, on 
the bicentennial of Newton's death, Einstein mentioned as a sign of “Newton’s wisdom” the 
recognition by the latter that “the observable geometrical quantities (distances between the 
material particles) and their evolution in time do not completely characterize the motions from a 
physical point of view” and that “space must possess some sort of physical reality, [. . .]a reality of 
the same kind as the material particles and their distances” (Einstein (1934), p. 202). 

9. Einstein (1916), p. 103. The passages quoted by Einstein occur in Mach (1963), II.6, 
pp. 217, 224, 226. They include some well-known remarks on Newton’s water-bucket experiment. 
This, Mach observed, can only teach us that no noticeable centrifugal forces act on the water when 
it moves relatively to the bucket’s thin walls, whereas such forces do show up as soon as the water is 
moving together with the bucket relatively to the mass of the earth and the other sidereal bodies. 
However “nobody can say what the quantitative and qualitative outcome of the experiment would 
be if the bucket walls became thicker and more massive; eventually, several miles thick”. 

10. Einstein (1913), p. 290. 

1t. Einstein (1912a, b). In this theory, the inertial mass that occurs in the formulae for impulse 
and energy is m/c, where mis a constant characteristic of the body under consideration, and c is the 
speed of light in racuo, which—as we noted on p. 138—is proportional to the gravitational 
potential. As masses are piled up near the body, the gravitational potential decreases and m/c 
increases. “This agrees with Mach’s bold idea that inertia originates in an interaction of the 
massive particle under consideration with all the rest” (Einstein and Grossmann (1913), p. 6). 

12. Einstein (19134), p. 1260. 

13. Einstein to Mach, 25th June 1913; printed in fascimile in Honl (1966), p. 27. Compare 
Mach's remark on a thick-walled rotating bucket in note 9. 

14, Einstein (1956), pp. 101 f. On the linear approximation to General Relativity see p. 325, 
note 35. 

15. Einstein (1956), p. 103. 

16. Einstein (1917), p. 145. 

17. Einstein (1915c); K. Schwarzschild (1916). 

18. “Somit wurde die Tragheit durch die (im Endlichen vorhandene) Materie zwar beeinflusst 
aber nicht bedingt™ (Einstein (1917), p. 147). 

19. He assumes for simplicity’s sake that the metric is spatially isotropic at each point. Relative 
to a suitable chart x the line element is then given by ds? = ~ Adx*dx*+ B(dx°)’. In a first 
approximation the inertial mass ofa test particle is then equal to mAB™ ''? (where m stands for the 
particle’s proper mass). This expression will vanish at spatial infinity if A tends toO or Btends to x. 

20. Obviously Einstein had in mind the velocities of stars belonging to the Galaxy. At the time 
of his writing a substantial segment of the astronomical profession still maintained that the Galaxy 
comprises all that can be observed in the sky. The fact that most nebulae are star systems or 
galaxies by themselves, lying at enormous distances from our own, was not established beyond 
reasonable doubt until the twenties, when the 100-inch Hooker telescope at Mt. Wilson was 
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trained on them. However, by 1917 V. M. Slipher had already measured the radial velocities of 
some nebulae and had found that NGC 1068 and NGC 4594 were receding from us at about 
1000 km/s. 

21, Riemann (1867), p. 22. 

22. A topological space S is said to be compact if every open cover of S includes a finite subcover. 
Suppose that S is a compact spacelike hypersurface of a relativistic spacetime. S can surely be 
covered by a family of subsets of unit volume, open in S. Since S is compact there is a finite 
collection of at most m such subsets which also covers. S. Hence, the volume of S is not greater 
than m. 

23. S> denotes the set of all real number quadruples (a, b,c,d) such that a? +b? +c? +d? = 1. 
We look on S? as endowed with the topological, differential and metric structures induced in it by 
inclusion in R*. The metric structure in question is the proper Riemannian metric of constant 
positive curvature known as spherical or “Riemann” geometry. 

24, Wecanchoosea chart with coordinate functions ¢, y, y, 8, such that G/ét is the worldvelocity 
field of matter and ¢, w and @ define a set of hyperspherical coordinates on each hypersurface 
orthogonal to ¢/dt. Relative to this chart, matter is “at rest” and all components of the energy 
tensor vanish except T°°, which is constant and equal to the uniform energy density p (cf. eqns 
(5.7.14) and pp. 118 ff.). Recalling that T? = p (p. 172) and noting that, relative to the chosen chart, 
Jou = 9 (for a = 1, 2, 3), we can substitute the above values of the energy tensor components into 
(D. 6) to obtain: 


Roo —Gool4 + Kp) = —KP(Goo)? 


(1) 
R,,-gi(A+tKp) =0 if either ior j 4 0 


As W. de Sitter (1917¢) pointed out, equations (1) are satisfied by the g;; implied by the line 
element: 
ds? = dt? — R? (dy? + sin? p(dy? + sin? wdé?)) (2) 


where R? = 1/4. The “cosmological constant” 4 thus turns out to be equal to the curvature of each 
spacelike section of the world. De Sitter (1917, p. 231; 1917c, pp. 7 f.) also noted that solution (2) 
can be realized in two topologically inequivalent spacetimes: one in which the spherical sections 
orthogonal to 0/dt are the spherical 3-spaces isometric to.S? countenanced by Einstein, and another 
in which they are elliptical 3-spaces, in the sense of Klein (1871). Elliptical 3-space is 
homeomorphic with the real projective 3-space P* (whose points are the sets {ar|0 # we R}, for 
each non-zero re€ R*), which is harder to figure out than S°, but has the intuitive advantage that 
any two geodesics in it meet exactly once, whereas in spherical space all geodesics issuing from a 
point meet again at an “antipodal” point, However Einstein, in a postcard sent on 16th April 1919 
to Klein himself, expressed his preference for the spherical over the elliptical model. 

25. Einstein (1917), p. 149. Current data indicate that galaxies form clusters which in turn are 
grouped into superclusters up to 6x 107 lightyears wide, but that the distribution of such 
superclusters is indeed homogeneous (E. J. Groth et al. (1977)). Cf. the fascinating discussion of 
“Homogeneity and Clustering” in P. J. E. Peebles (1980), Chapter I. 

26. Einstein (1921a), p. 129: “The distribution of the visible stars is extremely irregular, so that 
we on no account may venture to set down the mean density of starmatter in the universe as equal, 
say, to the mean density in the Milky Way.” 

27. De Sitter showed that the eqns (1) in note 24, with p = 0, are satisfied by the g;; implicit in 
the line element: 


ds? = cos? pdt? — R* (dy? + sin? p(dy? + sin? ydé7)) (3) 


where R? = 3/4. He contrasted this “system B” with Einstein's “system A”, defined by (2) in 
note 24. If R° is endowed with the flat metric of signature (+ — — — —)de Sitter’s system B can be 
isometrically mapped onto the hypersurface { (t, a, b,c, d)|a? + b? +c? +d? —1? = const.'. On this 
“hyperboloid” (as on its readily visualizable analogue in R°, the ordinary hyperboloid a? + b? 
—1? = const.) geodesics orthogonal to the section = 0 recede from each other and the sections ¢ 
= const. become larger and larger as the absolute value of t increases. (The geometry of de Sitter’s 
system B was spelled out by Felix Klein (19186); cf. E. Schrodinger (1956), Ch. I; a very clear 
explanation, with a diagram, is given by W. Rindler (1977), pp. 183-7.) Hence, de Sitter’s world is 
static only because it is empty, but a configuration of test particles launched into it will not 
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conserve its shape or size as t varies. De Sitter (19175), p. 236, after quoting Slipher’s values for the 
radial velocities of NGC. 1068 and NGC. 4594 (see note 20), observed that “about a systematic 
displacement towards the red of the spectral lines of nebulae we can, however, as yet say nothing 
with certainty. If in the future it should be proved that very distant objects have systematically 
positive apparent radial velocities this would be an indication that the system B, and not A, would 
correspond to the truth”. This is, to my knowledge, the earliest suggestion that we may be living in 
an expanding universe. 

28. Cf. W. de Sitter (1917a), pp. 1221 f.: “From the purely physical point of view, for the 
description of phenomena in our neighbourhood this question [namely, the choice between 
Einstein’s solution (2) in note 24, de Sitter’s solution (3) in note 27, and the flat Minkowski 
metric—R.T. ] has no importance. In our neighbourhood the g,,; have in all cases within the limits 
of accuracy of our observations the values [n;; ], and the field-equations [D.6] are not different 
from [5.7.17]. The question thus really is: how are we to extrapolate outside our neighbourhood? 
The choice can thus not be decided by physical arguments, but must depend on metaphysical or 
philosophical considerations, in which of course also personal judgment or predilections will have 
some influence.” 

29. Einstein's remark, in his (19185), p. 243, to the effect that, as far as he knew, the modified 
field equations did not admit a singularity-free empty solution must be read in the light of his 
(1918c), submitted to the Prussian Academy on March 7th, 1918 (i.e. one day after his (19185) was 
received for publication in Annalen der Physik). He claims there that de Sitter’s solution (formula 
(3) in note 27) contains an unavoidable singularity at g = 2/2. A fortnight later, however, he 
acknowledged to Felix Klein, who had taken exception to his claim: “You are perfectly right. De 
Sitter’s world is in itself singularity-free and all its worldpoints are equivalent. A singularity arises 
through the substitution that furnishes the static form of the line element. [. . .] My critical remark 
with regard to de Sitter’s solution requires correction. There is in fact a singularity-free solution of 
the gravitational equations without matter. But this world should by no means be considered as a 
physical possibility. For it is not possible to establish in it a time ¢ such that the three-dimensional 
sections t = const. do not intersect each other and that these sections are equal (metrically)” 
(Einstein to Klein, postcard, 20th March 1918; Princeton Einstein Archive, Microfilm Reel 14). In 
other words, Einstein now rejects de Sitter’s world as a valid counterexample to Mach’s Principle 
because it cannot be partitioned by a universal time into isometric spacelike slices. Why Nature 
should be expected to meet this Procrustean demand is anybody's guess. 

30. For a more detailed analysis with numerous references to the literature, see H. Goenner 
(1970), (1980b). 

31, A. H. Taub (1951); E. Newman, L. Tamburino and T. Unti (1963); C. W. Misner and A. H. 
Taub (1969). For a brief survey of these solutions see, for instance, M. P. Ryan and L. C. Shepley 
(1975), pp. 132-46. Cf. the interesting comments on Taub—NUT space by C. W. Misner (1967). 

32. S. Weinberg (1972), p. 87. The same point was made independently in the title of an article 
by J. F. Woodward and W. Yourgrau (1972), on “The incompatibility of Mach’s Principle and the 
Principle of Equivalence in Current Gravitation Theory”; however, the authors, who propose to 
discard the strong Equivalence Principle and hence General Relativity in order to save Mach’s 
Principle, do not succeed in explaining—at any rate to someone as obtuse as myself—wherein the 
alleged incompatibility consists. 

33. See J. Callaway (1954). The same objection can be raised against the more exact treatment of 
the Lense—-Thirring problem as a perturbation of the Schwarzschild solution by D. R. Brill and 
J. M. Cohen (1966) and J. M. Cohen and D. R. Brill (1968). 

34. If yis not a geodesic and v, , v, and v, are three spacelike vectors at a point P of y which form 
an orthonormal tetrad with the tangent yp, their parallels along y at the infinitesimally distant 
point P’ do not form an orthonormal tetrad with the tangent yp: (for the latter is not parallel to yp). 
However, we do obtain such a tetrad if we take the projections of the said parallels onto the 
orthogonal complement of }p-. If this operation is repeated again and again as we advance by 
infinitesimal displacements along the range of y we generate a succession of spacelike triads which 
satisfies our intuitive notion of a system of non-rotating spatial axes (Weyl (1923a), p. 268). This 
idea can be formally expressed as follows. Let U be a field of timelike unit vectors in the 
relativistic spacetime (#g). The Fermi derivative F\,X of a vector field X along U is equal 
to VyX —g(VyU, X)U — g(U, X)VUU (where V,,X denotes the covariant derivative of X along 
U). The following properties the Fermi derivative are not hard to prove: (i) F yU = 0; (ii) if X is 


NOTES 331 


orthogonal to U—te. if g(X, U) = 0—the value of FX at each point P is the projection of the 
local value of VX on the space of vectors orthogonal to U p; (iti) if X and Y are vector fields such 
that FX = FyY = 0, g(X, Y)is constant along each integral curve of U. Let y be such an integral 
curve. A vector field X is Fermi transported or Fermi propagated along y if, on the range of y, Fy X 
= 0. A compass of inertia along y is an ordered triple of mutually orthogonal vector fields of unit 
length, defined on the range of y, each of which is orthogonal to U and Fermi propagated along y. 
If y is a geodesic, then, on its range, VyU = 0 and FyX = VyX, so that X is Fermi propagated 
along » if and only if it is also parallel along y. Thus our definition of compass of inertia along a 
timelike geodesic is just a special case of the more general definition given here. 

35. The vorticity (also called the rotation, or spin) of a fluid can be defined in several ways. The 
following concept of a vorticity tensor was introduced by J. L. Synge (1937). We represent all 
geometrical objects by their components relative to a given chart. Thus U; and U! stand for the 
covariant and contravariant forms of the worldvelocity field U, and U;., denotes the covariant 
derivative of the former. The fluid’s departure from geodesic motion is measured by the 
acceleration field U, = U;, U4. The vorticity tensor (in effect, a 2-form) is given by: 


w= 3(U;,;-U;,,+ U,U,-U,U,)) 


See J. L. Synge (1960), pp. 169-73. 

36. See K. Gédel (1949a, 1952), I. Ozswath and E. L. Schiicking (1969). Godel (1952), p. 176, 
proposed the following “directly observable necessary and sufficient reason for the rotation of an 
expanding spatially homogeneous and finite universe”: “For sufficiently great distances there must 
be more galaxies in one half of the sky than on the other half.” 

37. Einstein (1922a), p. 19. 

38. Einstein (1922a), p. 18; my italics. 

39. Einstein to C. B. Weinberg, Ist December 1937; Princeton Einstein Archive, Microfilm 
Reel 17. Einstein summed up his own position as follows: “Alles Begriffliche kann seine 
Berechtigung nur auf seine Beziehbarkeit auf das, aber keineswegs auf eine Ableitbarkeit aus dem 
sinnlich Wahrnehmbaren griinden.” 

40. Thus, for example, the general relativistic expression for the components of the energy 
tensor of a perfect fluid is 

T! = (p+ p)U'U! + pg! 
where U is the worldvelocity field, p the energy density and p the pressure. Cf. the general 
definition of the energy tensor in S. Weinberg (1972), Chapter 12, or J. V. Narlikar (1979), 
Lecture 6. 

41. Einstein to Pirani, February 2nd, 1954; Princeton Einstein Archive, Microfilm Reel 17. This 
very important letter was written in response to a manuscript by Pirani, “On Mach’s Principle and 
preferred directions in General Relativity”. The words “alle nicht auf sie zurickfihrbaren 
Begriffe”, whose translation I have placed in square brackets, are crossed out in the original MS; 
they bear witness to the association of Mach’s Principle with positivistic reductionism in Einstein’s 
mind. Cf. note 39. 

42. D. W. Sciama (1953); C. Brans and R. H. Dicke (1961); F. Hoyle and J. V. Narlikar (1964, 
1966). The subtle analyses by R. H. Dicke (1962, 19646, c), though partly outdated, are still very 
instructive from a methodological point of view. 

43. H. Hénland H. Dehnen (1963, 1964) interpret Mach’s Principle as the requirement that the 
inertial and gravitational forces that show up in each coordinate system be fully and uniquely 
determined by the sum of the energy-momentum (i.e. world-momentum) density of matter and the 
energy-momentum density of the gravitational field. In the light of this requirement the authors 
classify the solutions of the Einstein field equations into strictly Machian (Einstein’s static 
universe, de Sitter’s empty universe, the Friedmann worlds), loosely Machian (Ozswath and 
Schiicking’s finite rotating universe) and anti-Machian (Minkowski spacetime, the Schwarzschild 
solution, the Lense-Thirring solution for a rotating central body). Gddel’s infinite rotating 
universe, dismissed at first blush as anti-Machian (1963, p. 210), turns out, on a closer look, to be 
loosely Machian (1964, p. 288). 

44, J. A. Wheeler (1964) conceives Mach’s Principle as “a boundary condition to separate 
allowable solutions of Einstein’s equations from physically inadmissible solutions” (p. 306). He 
States it thus: “The specification ofa sufficiently regular closed three-dimensional geometry at two 
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immediately succeeding instants, and of the density and flow of matter-energy, is to determine the 
geometry of space-time, past, present, and future, and thereby the inertial properties of every 
infinitesimal test particle” (p. 333). Wheeler’s formulation is in itself very interesting and is directly 
linked to some highly innovative research done on General Relativity in the last quarter of a 
century (e.g. by Arnowitt, Deser and Misner on GR as a dynamical theory; by Yvonne Choquet- 
Bruhat and others on the initial-value problem; the new vindication of the virtual uniqueness of 
Einstein’s field equations by Hojman, Kuchar and Teitelboim, etc.) However, it seems to me that 
by allowing and indeed prescribing the independent specification of the global 3-geometry and its 
time rate of change ona spacelike slice of the world Wheeler exceeds the bounds of what may fairly 
be called Machian. Besides, in order to yield the required determination of the spacetime 
geometry, “past, present; and future”, the spacelike slice in question must be a global Cauchy 
surface, i.e. a hypersurface that is met once (and once only) by every maximally extended null or 
timelike curve through every worldpoint outside it. But there is no shred of evidence that there is a 
global Cauchy surface in the real world. 

45. According to Raine’s criteria a solution is Machian if (i) the metric is uniquely determined 
by the Riemann tensor, and (ii) the Wey! tensor is a linear functional of the matter current. Raine’s 
formal statement of (i) has full generality, but his—rather formidable—exact version of (ii) applies 
only to spacetimes admitting a Cauchy surface. Raine shows that Minkowski spacetime and all 
spacetimes that approach the Minkowski metric at infinity violate condition (i). Condition (ii) is 
violated by all empty non-flat spacetimes. Among the relativistic spacetimes admitting a global 
time coordinate that partitions them into geometrically homogeneous spacelike sections 
(“homogeneous worlds”) only the isotropic ones, characterized by the Robertson~Walker line 
element (6.3.2), pass Raine’s criteria, but all rotating or anisotropically expanding spatially 
homogeneous spacetimes are excluded as non-Machian. 


Section 6.3 


1. Unfortunately I can barely touch on the fascinating subject of relativistic cosmology. For a 
readable and reliable introduction I recommend the relevant chapters of W. Rindler (1977), pp. 
192-244 or J. V. Narlikar (1979), pp. 173-230. At the time of writing the best up-to-date general 
surveys of the subject were, in my opinion, A. K. Raychaudhuri’s Theoretical Cosmology (1979) 
and D. J. Raine’s The Isotropic Universe (1981). M. P. Ryan and L. C. Shepley (1975) give a fairly 
clear presentation of the mathematical theory. A thorough student, who longs for a detailed, step- 
by-step argument will take delight in S. Weinberg (1972), pp. 375-633. 

2. Kepler’s paradox of 1610, subsequently discussed by E. Halley (1720a, b), among others 
now goes under the name of Olbers, the German astronomer who revived it in 1823. For an up-to- 
date treatment see E. R. Harrison (1964, 1965), or Harrison’s article in the American Journal of 
Physics of 1977. Let me note only that Einstein's 1917 world model is in fact no less liable to the 
paradox than Newton's, for the light emitted by a star will endlessly circumnavigate the static 
spherical space until obstructed by another star. Harrison’s brilliant solution rests on the fact that 
the energy supply of the stars must run out ina time much shorter than would be required to fill the 
sky with starlight. 

3. Hubble’s estimates were based, of course, on the apparent luminosity of the objects 
concerned. Due to a confusion between two kinds of star populations—distinguished by Baade in 
the 1940s—Hubble underestimated the distance to the brightest galaxies and hence to all others. 
M 31 is now believed to be some 2,000,000 lightyears away from us and to be twice as massive as 
our galaxy. 

4, Eddington (1924), p. 162. The average velocity of the sample was + 572 km/s. The greatest 
negative (i.e. approaching) velocity in Eddington’s list was — 300 km/s (NGC 221 and 224). 

5. E. Hubble (1929), p. 109, Table 1. The rest of Eddington’s list goes into Hubble's Table.2, of 
“Nebulae whose distance are estimated from radial velocities” (by means of Hubble’s linear 
relation). 

6. See the quotation on p. 330, note 27. 

7. Eddington (1924), pp. 161 ff.; Weyl (1923a), pp. 294 f., 322 f.; Weyl (1923c); Hubble (1929), 
p. 173. Cf. notes 24 and 27 to Section 6.2. 

8. Ryan and Shepley (1975), p. 57. The authors refer specifically to the closed Friedmann 
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solutions, ie. those in which, as in Einstein’s static universe, the spacelike hypersurfaces 
orthogonal to the worldlines of matter are compact (viz. proper Riemannian 3-manifolds of 
constant positive curvature). Giordano Bruno’s world model is the “common sense” universe of 
stars without end in Euclidean space. 

9. Einstein (19225). Einstein (1923d) admitted that his objection to Friedmann was based on 
an error of calculation. 

10. G. Lemaitre (1927)}—published in English translation in the MNRAS as Lemaitre (1931a). 
Lemaitre’s assumptions are somewhat more general than Friedmann’s inasmuch as he treats 
matter as a perfect fluid—with time-dependent, spatially isotropic pressure—not as pressureless 
dust. (He even suggests that the pressure might be responsible for the expansion of the universe!) 
However, the current habit of distinguishing between “Friedmann universes” in which A = 0, and 
‘Lemaitre universes” in which A can take any value, is historically unjustified, for Friedmann 
(1922), p. 378, explicitly made allowance for an arbitrary A—“which we may also set equal to 
zero”—and studied the consequences of its satisfying several alternative inequalities. 

11. Robertson was well acquainted with Friedmann’s work but he learned about Lemaitre’s 
after his first paper was ready. Like Lemaitre he made allowance for spatially isotropic positive 
pressure. The meaning of his postulate of spatial isotropy will be made clear below. 

12. Einstein (1931), p. 236. Einstein assumes that the cosmic fluid has negligible pressure. 

13. Einstein (1956), p. 127. 

14. Friedmann (1922) only considers the case K > 0. The case K <0 is the subject of 
Friedmann (1924). The possibility that K = 0, as in the Friedmann solution countenanced by 
Einstein and de Sitter (1932), is ignored by Friedmann, probably because he did not realize that it 
also yields—in fact for every value of A—non-static solutions expanding from a point in the finitely 
distant past (or contracting to one in the finitely distant future). 

15. This opinion can be traced back to F. Ueberweg (1851) and H. v. Helmholtz (1868). It was 
very influential until the advent of General Relativity, with which, of course, it is incompatible. 
Only the (complete) proper Riemannian n-manifolds of constant curvature have the said property. 
If n = 3, the manifold must admit a 6-parameter group of motions, which can only be the 
Euclidean group, the Bolyai-Lobachevsky group, or the group of motions of spherical or elliptical 
geometry. 

16. Friedmann (1922), p. 379. 

17. See, for instance, Ryan and Shepley (1975), Chapters 8, 9 and 10. 

18. H. P. Robertson (1929), p. 823. 

19. An extension of Fubini’s result to Riemannian n-manifolds admitting an (m+ 1)m/2- 
parameter group of motions (m < n) is proved in detail by Weinberg (1972), pp. 396 ff., who also 
provides the requisite mathematical background. 

20. Robertson (1929), p. 827, n. 9, criticized Friedmann for having overlooked that his 
assumptions of class II implied that 6/0x° generates a congruence of geodesics. Friedmann had 
expressed the line element of de Sitter’s solution—regarded as a special case of his own—in the 
same form used by de Sitter himself and reproduced above on p. 329, note 27. This line element is 
referred to a chart in which the parametric lines of the time coordinate function are not geodesics. 

21. If the integral lines of 0/dx° are timelike geodesics, x° must be related to the proper time t by 
the affine transformation x° = at+B(a, BER, « € 0) along each such line. By (F2), the scale 
factor a must be the same everywhere. 

22. Asa matter of fact, (II.1b) was the only consequence that Robertson (1929) drew from his 
demand, for he had postulated (II.2) from the outset. See also Robertson (1933), p. 65 and Note C. 
However, (II.2) follows at once from the theorem by Fubini on which Robertson’s argument rests. 

23. This was Lie’s solution of the “problem of space” formulated by Helmholtz. 
(S. Lie (1890); S. Lie and F. Engel (77G), vol II], Abt. V.) The following consideration may satisfy 
the reader that every hypersurface x° = const. must have a constant curvature. Let H be such a 
hypersurface. All sectional curvatures at a point Pé H must be equal, say, to K p, because H is 
isotropic at P. But if the scalar field P+ K p varied as P ranges over H, its gradient would single out 
a particular direction at each point of H. 

24. Robertson (1935) derives in fact all three assumptions of class II from what he calls, after 
E. A. Milne, the cosmological principle. This he states as follows: The “fundamental particle- 
observers” A, A’, . . ., are “each equipped with a clock, a theodolite, and apparatus for sending and 
receiving light-signals”; “the description of the whole system, as given by A in terms of his 


334 RELATIVITY AND GEOMETRY 


immediate measurements, is to be identical with the description given by any other fundamental 
observer A’ in terms of his own measurements” (loc. cit., p. 285). As Iam not sure that I understand 
what all this means, I have preferred to glean from Robertson’s proof the premises that he actually 
uses in the construction of a global time coordinate. Let me mention also that A. C. Walker 
(1935, 1937), who independently achieved the same results as Robertson, based them on (i) “the 
cosmological principle”, according to which “the totality of observations that any observer can 
make on the system is identical with the totality of observations that any other observer can make 
on the system”, (ii) “the principle of symmetry”, to the effect that “each fundamental observer sees 
himself to be a centre of spherical symmetry”; and (iii) Fermat's principle of least action, governing 
the light-paths (A. C. Walker (1937), pp. 94, 103, 91). 

25. Relative to any chart whose time coordinate function is x° the metric components g; ; satisfy 
the following conditions: (i) go, = 920 = 0 (by (F1)); (ii) gag is equal to minus the product of a 
function g,, depending on the space coordinates only, and the square of a function S depending on 
x° alone (by F3); (iii) ggg depends on x° only (by F2). 

26. See, for instance, Weinberg (1972), p. 403. 

27, Note that in order to evaluate z one must know both vy, and v,. In practice, one assumes that 
the light received from distant galaxies was emitted in the remote past with the same frequency 
with which light of the same type—i.e. of the same atomic origin—is now radiated in our terrestrial 
laboratories. The type of light of a particular frequency is recognized by the pattern of 
neighbouring spectral lines. 

28. The calculations leading to (6.3.8) show that the only non-zero components of the linear 
connection determined by the Robertson—Walker line element (6.3.2) are TG, = S$s- "83, 
P= SSdxp. and T5, = Thy where a dash over a symbol indicates that it denotes a component of 
the “spatial” metric do or its linear connection. Substituting these values for the I"', in the geodesic 
equations (D.8) we verify at once that the parametric lines of t are geodesics. 

29. See, for instance, Weinberg (1972), pp. 381-3. 

30. Apparently Friedmann overlooked this, for he describes p as “an unknown function of the 
({four] world-coordinates” (Friedmann (1922), p. 380 n. 1), thereby suggesting that his assump- 
tions are much less stringent than they are in fact. 

31. Eliminating S from (6.3.9) we obtain “Friedmann’s equation”: 


1 (xp +4 
$= 5 (Fs 48) (+) 


The expression in parenthesis on the right-hand side of (*)is a polynomial in S that we shall denote 


by f(S). Obviously, tg can be obtained by evaluating the elliptic integral {( Ju/f(u))du from 
u = Sy tou = 0. By plotting S asa function of t we see at once that 0 < |tg| < So/So. Note that if 
we had chosen S to be the negative square root of S?, (6.3.10) would entail that S > O when / < O;in 
that case, S could not have a maximum, and would tend to 0 as t approaches tp. 

32. The different alternatives are carefully discussed by Rindler (1977), pp. 230 ff. Allowing a 
positive isotropic pressure makes no essential difference. 

33. Cf. the discussion of the Robertson—Walker line element in Rindler (1977), pp. 207 ff: 
especially the figure on p. 209. 

34. Suppose that our Friedmann world (¥ , g)contains a point P, = lim P,. Then the metric g, 
and hence the Riemann tensor R and the curvature scalar R must exist and be smooth at P,. But 
then R(P,) = R(lim P,) = lim R(P,), and we know that the latter limit does not exist. 

35. Cf. Friedmann (1922), p. 384 n.: “The time from the creation of the world is the time that has 
elapsed from the moment when space was a point to the present moment.” 


Section 6.4 


1. A complex-valued function fis said to have a singularity at a point z of the complex plane if 
the following two conditions are met in erery neighbourhood U of z: (i) f is analytic on an open 
subset of U; (11) f cannot be extended to an analytic function defined on all U. Iff has a singularity 
at 2, < is said to be a singular point of f. 

2. A star a few times more massive than the Sun collapses under its own weight when its 
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thermonuclear energy sources become exhausted (Oppenheimer and Snyder (1939)). If the star is 
and remains spherically symmetric, has zero angular momentum and is far removed from all other 
large bodies, the field surrounding it can be represented by the Schwarzschild solution to Einstein 
equations in vacuo, R;; = 0 (Schwarzschild (1916), Birkhoff (1923)), Let the Schwarzschild line 
element be given by 
-1 
ds? = (: ali Jae = (: -26m) dr? —r? (d6? + sin? @dg?) 


r 


where m is the star’s mass and the gravitational constant G = x/8x. When the contracting star 
attains a radius less than 2Gm (= 6 km if m is twice the solar mass), nothing can arrest its quick 
compression to infinite density at r = 0. A test particle or a light pulse inside the sphere r < 2Gm is 
irresistibly sucked into the singularity. In the more realistic case of a rotating star, the field at the 
final stages of collapse would most probably correspond to one of the Kerr solutions (Kerr (1963); 
Newman et al. (1965); Carter (1968, 1970, 1977)). The situation here is more complex but 
ultimately the same as in the idealized Schwarzschild case. For a clearer view of this matter see 
S. W. Hawking’s luminous article on “Black Holes” in Lerner and Trigg (1981), the textbook 
presentations in Rindler (1977), pp. 149-56, and Olianian (1976), Chapter 9, and the review of 
recent research by J. C. Miller and D. W. Sciama (1980). 

3. Such ingredients include, it is true, the complete metric space R. But even this can be built by 
“bootstrapping”, as in the original version of the above ploy, first suggested by Cauchy and carried 
out by Weierstrass. Let Q be the field of rational numbers. A sequence P,, P2,... of rational 
numbers is called a Cauchy sequence if for every positive 6€ Q there is an integer N such that 
|P,—P,| <6 whenever N <r,s. As is well known, not every Cauchy sequence of rationals 
converges to a limit in Q. We shall say that two Cauchy sequences of rationals (P,), (Q,) are 
equivalent if their “mix” P,, Q,, P2, Q2,..., is a Cauchy sequence. We define R as the collection of 
equivalence classes of Cauchy sequences of rationals. The function d: R x R — R assigns to any pair 
of “points” (P,), (Q,) € R, the Cauchy sequence of rationals (| P,, — Q,|). (R, d) is a complete metric 
space entirely built from the incomplete field Q. 

4. Kobayashi and Nomizu, FDG, vol. I, p. 72. 

5. C. W. Misner (1963). 

6. “Space-time is represented as a connected paracompact four-dimensional C” manifold 
with pseudo-Riemannian metric tensor g, , of signature minus two. [.. ] Space-time will be said to 
be singularity-free if the metric is a field of class C? and .@ is geodesically complete with respect to 
the metric” (S. W. Hawking (1966a), p. 512). “Space-time is said to be singularity-free if it is time 
complete (all timelike geodesics can be extended to arbitrary length) and if the metric is C?” (S. W. 
Hawking (19665), p. 491). “A space is said to be singular if it contains an inextendable timelike or 
null geodesic of finite affine length” (R. P. Geroch (1966), p. 446). Geroch refers to Misner (1963), 
but no such definition is to be found there. Having equated the absence of singularities with metric 
completeness in the case of proper Riemannian manifolds, Misner (1963), p. 925, cautiously 
observes that “although we lack a definition of ‘non-singular’ for Lorentz-signature Riemannian 
manifolds, we have two sufficient conditions: compactness and affine [ie. geodesic—R.T. ] 
completeness.” In the same paper he shows that one of these conditions can be had without the 
other (see page 215). Misner discussed again the interpretation of spacetime singularities in the 
paper he coauthored with A. H. Taub (1968), Section II; no definition is proposed there. 

7. Hawking and Ellis (1973), p. 258. 

8. See, for example, p. 330, note 29. 

9. Tassume that tg is the g.l.b. of the range of the time coordinate function t. It may seem that I 
thereby wantonly ignore the so-called periodic Friedmann solution, which holds if 4 < 0, or if 
4 = Oand k = 1, and can also obtain if k = 1 and 0 <A < A, (where Ag = (2/xpS°)’ is the value of 4 
corresponding to the static Einstein universe). In the case of the periodic solution the graph of S as 
a function of t is a cycloid, and the range of t can be taken to be the entire real number field minus 
the periodically recurring zeroes of S. The interval between two successive zeroes of S is the time of 
one cosmic period during which the Friedmann world expands and recontracts. The manifold # * 
mapped by t onto the punctured real line is obviously disconnected, with a different component 
for each cosmic period. Consequently, ¥ * does not underly one, but many relativistic spacetimes, 
in the sense agreed on p. 5.22. In fact, the periodic Friedmann solution confirms our wisdom in 
demanding that a spacetime manifold should be path-connected. For let P, Qe #* such that 
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t(P) < tp < t(Q),S(t,) = 0. Then no information whatsoever can be transmitted from P to Q. 
P does not causally or chronologically precede Q in the sense of the theory of causal spaces (Section 
4.6) and the only reason we have for saying that P comes before Q in time is the purely formal one 
that the interval assigned by the function 1 to the cosmic period of the former, precedes in R the 
interval assigned by the same function to the cosmic period of the latter. But the cosmic periods of 
Pand Q have very little to do with each other and shall be viewed as successive stages of one and the 
same world only if one is anxious to save by all means the eternity of Mother Nature (cf. note 26). 

10. Weare not compelled to parametrize each timelike geodesic y of (W, g) by the proper time t. 
But an affine transformation t+ at + f merely replaces t, by another finite real number. We may 
indeed parametrize all matter worldlines by ¢ = log(t—tg), whereby their domains become 
unbounded below (and equal to R if, from the outset, they are unbounded above). But ¢ is not an 
affine parameter (p. 278) and its adoption cannot alter the geodesic incompleteness of (W,, g). 
In this connection, I would like to state a theorem proved by C. J. S. Clarke (1971) which throws 
much light on the nature of spacetime singularities: If (W, g) is a strongly causal spacetime (in the 
sense defined on p. 125, i.e. if, as it is presumably the case in the world we live in, there are in (W,, g) 
no closed or almost closed timelike curves), and if it contains incomplete null geodesics, there is a 
scalar field w on W such that every null geodesic in (W, wg) is complete. Since the metrics g and 
«wg are conformal (p. 192) they both define the same causal structure on W and the paths of their 
respective null geodesics are identical. However, while the affine parameters of some of the null 
geodesics of g are bounded above or below, the null geodesics of w?g—though they include 
parametrizations of the very lightpaths which appear to “run into a singularity” in (W, g)—all have 
unbounded affine parameters, ranging from — «x to x. 

11. See, for instance, Einstein (1956), p. 129. 

12. Thus W. de Sitter (1933), p. 631, commenting on the apparent inconsistency between 
Friedmann’s “time from the creation of the universe”, as calculated from Hubble's Law, and the 
much longer period required for the formation of a star according to the then “modern” theories 
of stellar evolution, observed that the perfect spatial homogeneity and isotropy presupposed by 
the line element (6.3.2) “is an idealization of the problem, taking account of inertia only and 
neglecting gravitation, or, in other words, replacing the mutual gravitational action between the 
separate galaxies by the averaged action of all galaxies in the universe combined. This 
approximation is, of course, entirely sufficient so long as the mutual distances are large, but ceases 
to be an approximation when these distances become very small.” As we approach t = Tg “the 
mutual perturbations [. . .] will be large and different for different bodies, and [. . .] the exact 
equality of all velocities will consequently be destroyed”. Hence, de Sitter concluded, “the 
conception of a universe shrinking to a mathematical point at one particular moment of time 
[t = tg} must thus be replaced by that of a near approach of all galaxies during a short interval of 
time near [t = tg]”. 

13. More formally, (¥ ; g) is generic if it meets the following “generic condition”: every timelike 
or null geodesic y in W has a point P at which the Riemann tensor R and the tangent v = ;'p satisfy 
the inequality 
: ; UU" (DUAR jrsk — OMARirsk — CPR ran + VMaRirsn) F O, 
where v,; = gj ;v’. 

14. The weak convergence condition follows from the field equations (D.7) and the hypothesis 
that T,;V'V/ 2 0 for every timelike vector V. The latter hypothesis says that the local energy 
density of matter, as measured by an observer with worldvelocity V, is nonnegative. The strong 
convergence condition presupposes that (T,,;—39;;Ta+« 'A)V'V/ = 0 for every timelike V. This 
holds good, for example, if 2 = 0, for a perfect fluid with positive density, unless it has a large 
negative pressure. See Hawking and Ellis (1973), pp. 94 f. 

15. Proofs are given in the papers mentioned at the end of each theorem and also in the 
magnificent book by Hawking and Ellis (1973), where the mathematical and physical background 
required for understanding them is explained in detail. A less formal but highly instructive 
discussion of the singularity theorems will be found in Geroch and Horowitz (1979), pp. 258-66. F. 
J. Tipler et al. (1980), pp. 147-52 report on recent efforts to weaken the convergence and causality 
conditions presupposed by the theorems. 

16. Condition (iii) of Theorem III can be formally stated as follows. For any given future- 
directed timelike geodesic y through P we denote its field of unit tangents by V. The function 
0 = V',, is called the divergence of V. Let w be an arbitrary unit timelike vector pointing to the 
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future at P. We set f, = g;;(P)w'Vb. Condition (iii) says that there are numbers b and 6 such that, 
for each such geodesic y, 6 becomes less than — df, within a length b//, measured from P along 7. 
Condition (iv-c) of Theorem I'V says simply that the divergence 0, in the above sense, on each null 
geodesic through P becomes negative at some point to the future of P. (As noted above, we may 
substitute “past” for “future” throughout the foregoing statements.) 

17. See, for instance, Geroch (1971b). A reasonable topology for the set #(¥) of Lorentz 
metrics on a 4-manifold is, say, the following. Each ge p(W ) is a cross-section of the bundle 
T,(#) of (0, 2)-tensors over ¥. Each open set A < T,(¥#) determines a subset By < (#") formed 
by all the Lorentz metrics which map into A. The collection of all such sets B 4 constitutes a basis 
for our topology. 

18. Ryan and Shepley (1975), p. 81. Misner’s example is the torus with Lorentz line element 
ds? = —cos xdx? + 2 sin xdxdy +cos xdy?. See Ryan and Shepley (1975), p. 95. 

19, See, for instance, Misner and Taub (1968). 

20. Awareness of this example, privately communicated by Geroch, led Hawking (1967) to 
abandon the definition of a singularity-free spacetime quoted in note 6 and to view “timelike and 
null geodesic completeness as a minimum condition for regarding space-time as singularity-free” 
(loc. cit., p. 169, my italics. See also Hawking and Ellis (1973), p. 258). 

21. Hawking and Ellis (1973), p. 276. 

22. Since Schmidt’s construction depends exclusively on the canonic and connection forms of 
the principal bundle F¥ , ¥ could be just any n-manifold endowed with a linear connection I. If 
is the Levi-Civita connection of a proper Riemannian metric g, ¥ is homeomorphic to the metric 
completion of (#; g). This shows that the Schmidt completion ¥ of a manifold ¥ with a linear 
connection is a natural generalization of the standard Cauchy completion of a manifold with a 
proper Riemannian metric. 

23. The condition of being past and future distinguishing is one of a series of increasingly 
stronger “causality conditions” that may be imposed on (¥ , g). The weakest is the chronology 
condition: there are no closed timelike curves in ¥ . Then comes the causality condition: there are no 
closed causal (i.e. nowhere spacelike) curves. These two conditions are equivalent in a generic 
relativistic spacetime that satisfies the weak convergence condition (Hawking and Ellis (1973), 
p. 192, Prop. 6.4.5). To be past and future distinguishing is a stronger requirement, which is in turn 
strictly weaker than the property of being a strong causal space, defined on p. 125. A maximally 
strong condition of this kind is the condition of stable causality, which will be satisfied if the metric 
g has a neighbourhood U in the topological space of Lorentz metrics on ¥ defined in note 17, such 
that for every he U, (# , h) satisfies the chronology condition. This amounts to saying that (¥ ; g)is 
stably causal if a small perturbation of the metric g will not generate closed timelike curves in #- 

24. The following proof, drawn from Geroch, Kronheimer and Penrose (1972), p. 550, will give 
an tdea of the style ofargument in this fascinating field of contemporary mathematical physics. Let 
y bea timelike curve in ¥. I~ (7))is clearly a past set. If J” (y)is not an IP it can be represented as the 
union of two non-empty past sets, p and q, none of which is a part of the other. Choose points 
Péep—qand Qeq — p. Pand Q belong to!” (y); there is a pair of points P’, Q’ on the range of y such 
that P < P’ and Q < Q’. Without loss of generality we may assume that P’ < Q’e! (7). Since p 
and q are past sets whose union is | (), whichever of them contains Q’ contains both P and Q, 
contrary to our assumption. Consequently, /~ (7')is an IP. To prove the converse, let p be an IP of 
W. We must try to find a timelike curve y such that p = 17 (y). Choose a point Pep. Set 
a=I17(pol*(P)) and b=17-(p—I*(P)). Since p is a past set, p=ab. Since p is an IP, either 
p=aor p=b. Since Peb, p = q. If P’ is another point of p, P’ belongs to the past of the intersection 
of p with the future of P and there must therefore be a third point P” € pin the future of both Pand 
P’. Choose a countable family of points Po, P,,. . ., dense in p and a point Q, ep: in the future of 
P,. Then, for each integer j > 1 wecan finda point Q;€ pin the future of both P,;_, and Q,_ ,. Since 
each point of the sequence Q,,Q3, . . . lies in the past of his successor, there is a timelike curve in p 
that joins them all. For each Pep, the set Up = pO 1* (P) is non-empty and open in p. Since the 
family (P,,) is dense in p, Upcontains one of its members, say, P,. But P, < Q,.,. whence P, € 17 (7). 
Consequently, p < I“ (+). Since pis a past set and the range of y is contained in it, p = 17 (y).Q.E.D. 

25. See Ellis and King (1974), Ellis and Schmidt (1977); and the survey article by Tipler, Clarke 
and Ellis (1980), pp. 155-67. 

26. A theorem recently proved by F. J. Tipler (1980) in a way explains why Friedmann’s 
expanding and recontracting world cannot continue through the initial or the final singularity in 
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any genuine physical sense. For if it did so continue, it would go time and again through exactly the 
same process of evolution and involution. But according to Tipler’s theorem a relativistic 
spacetime which (i) is generic, (ii) meets the strong convergence condition, (tii) contains compact 
Cauchy surfaces, and (iv) is uniquely determined from initial data on any of its Cauchy surfaces 
(cf. p. 332, note 44), cannot be time periodic; and the expanding and recontracting Friedmann 
world satisfies conditions (i}+(iv). 


Chapter 7 


Section 7.1 


1. Aristotle, Physica, IV, 10, 21879; 1V, 11, 22075; IV, 13, 222°10-14. 

2. To see that it is so, suppose that there is such a positive lower bound, and that the time lag 
between a cause on the sun and its effect on earth, or between a cause on earth and its effect on the 
sun, can be no less than 8 minutes. Then, a sudden change on the solar surface which I observe at 
12 noon GMT is simultaneous, by the negative causal criterion, with all states of my body from 
11.44 A.M. to 12 noon GMT, although they are certainly not simultaneous with one another. 

3. Reichenbach (1928), p. 161. The stated criterion applies only to “time order at the same 
point”. The expression “the same point” is prima facie meaningless in Relativity, but Reichenbach 
(1924), §5, explains that by a “real point” he means the worldline of a material particle. 

4. Reichenbach (1928), p. 170. 

5. Reichenbach (1928), p. 170. The epithet “topological” is not a happy one, for—at least in a 
Newtonian or a relativistic universe—the relation defined above is not invariant under arbitrary 
homeomorphisms of spacetime onto itself and therefore is not a topological relation. 

6. Reichenbach (1928), p. 171. See also the remarks in Reichenbach (1924), after Theorem 9. 

7. Reichenbach (1928), p. 172; (1958), p. 146. 

8. Grinbaum (1973), p. 29. Griinbaum’s use of “metrical” in this context is no less 
idiosyncratic than Reichenbach’s use of “topological” (see note 5). The paradigm of “metrical 
simultaneity” is Einstein's distant simultaneity in an inertial frame (Section 3.2), which is 
notoriously non-invariant under spacetime isometries (Lorentz point transformations) and 
therefore is not a metrical relation between “events” (worldpoints). 

9. Reichenbach (1924), Definition 2. 

10. Let ¢’ denote the arrival time of the first bouncing signal sent from A to B at ¢. Ina 
Newtonian world, t = t’. Ina relativistic world, # t' unless A and Bcoalesce or collide at t. (t < ¢’ 
if, as one normally assumes, the clock at A runs “forward”.) This is what we mean by saying that 
there is “a finite limiting velocity for causal propagation”. 

11. Note that in eqn. (7.1.1), if t4 = to, the choice of ¢ is irrelevant to the value of tg (which is 
then always equal to tg). This confirms that, as Adolf Griinbaum has repeatedly emphasized, 
Reichenbach’s rule for the conventional definition of simultaneity is effective only if t4 ¥ to, i.e. if 
there is “a finite limiting velocity for causal propagation” (note 10). If no such finite limiting 
velocity exists, the negative causal criterion is by itself sufficient to establish a transitive relation of 
simultaneity between events. 

12, Reichenbach (1924), Definition 8; (1928), p. 151. 

13. Cf. Arzeliés (1966), pp. 217-25. 

14, Reichenbach in effect never attempts to apply the rule beyond this narrow scope. However, 
he did not introduce inertial motion and inertial frames right at the beginning of his axiomatics 
because he was trying to show how close one can get to defining them by means of light signals and 
clocks alone, without resorting to metre rods. 

15. Eddington’s proof is not even mentioned by Reichenbach (1928), pp. 158 ff., where he 
discusses—and dismisses-—synchronization by clock transport in Special Relativity. 

16. See Eddington (1923), §§4, 11. Philosophers took an interest in synchronization by 
infinitely slow clock transport after Bridgman explained it in his Relativity Primer (Bridgman 
(1963), p. 65) and Ellis and Bowman (1967) based on it their case for the non-conventionality of 
standard synchronism in Special Relativity. See, for instance, Griinbaum (1969), Salmon (1969), 
Van Fraassen (1969a), Janis (1969), Ellis (1971), Bowman (1977). 
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17. In his discussion of ¢-Lorentz transformations, Winnie (1970) ignores two spatial 
dimensions. This can be done if the two charts involved in the transformation are matched (in the 
sense of p. 57), and the Reichenbach functions associated with each of them has its maximum or 
minimum value in the direction of the only spatial axis considered (which is also the direction in 
which the frame of one chart may be moving in the frame of the other). 

18. Giannoni (19784), p. 26, expands the set of Winnie transformations to form a group—which 
we may call the WG-group—by eliminating the causality condition (7.1.2) on Reichenbach 
functions. In this way, the rationale of Reichenbach’s rule, which was to display the whole 
spectrum of transitive simultaneity relations which are compatible with natural “topological” 
simultaneity (coupled, eventually, with the Law of Inertia), drops out of sight. I do not mean this as 
a criticism of Giannoni, who explicitly disassociates his version of conventionalism from 
Reichenbach’s (ibid., p. 18). 

19. Reichenbach (1928), p. 151, suggests as much when he says that standard synchronism is 
“essential” (wesentlich) for Special Relativity. But the theory, as we know, can very well do without 
standard or any other particular kind of synchronism if it is stated in a generally covariant form. 

20. P. Mittelstaedt (1977) forcefully argues this point. 

21. See on this point M. Friedman (1977); also the illuminating article by J. Earman (1972a). 

22. Hawking and Ellis (1973), pp. 198-201. 

23. Robb (1914) avoided the term “simultaneity” for philosophical reasons. As he says on p. 6: 
“According to the view here put forward, the only events which are really simultaneous are events 
which occur at the same place.” Mehlberg (1937), on the other hand, demands a definition of 
simultaneity capable of providing “a necessary and sufficient condition of this relation, not just a 
convenient (commode) criterion” (p. 171). The definition (D3) given on pp. 169 f. is similar to the 
one in Mehlberg (1966) which I reproduce in the main text. 

24. Added in proof: An English translation of Mehlberg (1935/37) is now available in volume I of 
Mehlberg, 7ime, Causality, and the Quantum Theory (Dordrecht: Reidel; ISBN 90-277-0721-9), 

25. Before closing the present section I believe I should mention the unceasing stream of 
proposals for establishing “true” simultaneity by means of signals of known speed. The 
proponents reason that if a signal is emitted at time t from a point P and it is known to travel with 
speed v to a point Q, at a distance d from P, the signal will arrive in Q at time t + d/v. They forget 
however that in order to specify a speed we must refer it to a definite space and time. Thus, for 
example, the vacuum speed of light c whose latest best value in metres per second is printed in every 
handbook of physics is referred to the relative space of an arbitrary inertial frame and to the time 
defined in it by standard synchronism. (Relatively to a time defined by non-standard synchronism 
in an inertial frame the speed of light in that frame is obviously a non-constant function of 
direction.) The manner how the definition of time is implicit in an experimental design for the 
measurement of light velocity can be quite subtle. Thus, the fairly sophisticated method of Fung 
and Hsieh (1980) rests on the double assumption that “a wave front is everywhere perpendicular to 
the direction of propagation” and that “for a parallel beam all wave fronts propagate in a single 
direction at the same speed” (p. 655), two premises whose truth-value is evidently not invariant 
under arbitrary transformations of the time coordinate function (or even under transformations 
restricted to inertial time scales). Other recent proposals of this type are aptly criticized by 
P. Ohrstrom (1980). W. Salmon (1977) surveys the established methods of measuring the speed of 
light, showing that when they do not presuppose a definition of synchronism, they actually 
measure the so-called “two-way” speed of light—i.e. the quotient between the distance travelled by 
a bouncing light signal and the travel time measured by a clock at the source. I am not altogether 
satisfied with Salmon’s analysis of Roemer’s method. Roemer’s idea can be summarized as follows. 
A distant observer at rest relatively to Jupiter will see any given Jovial moon hide behind the planet 
at fixed intervals, with constant frequency v. For a terrestrial observer, the frequency increases 
(decreases) by Av when Jupiter is approaching (receding from) the earth. The value of Av depends, 
according to the classical Doppler formula (which Roemer of course did not know but is 
nevertheless implicit in his work), on the actual period 1/v of the Jovial moon, the relative velocity u 
of Jupiter and the earth, and the (one-way) speed of light c. Setting 1/v equal to the annual mean of 
the observed period we can readily compute c from the equation c/(c + u) = (v + Av)/v, provided 
that we know u. Salmon says that “to ascertain the one-way speed of light across the earth’s orbit” 
Roemer had to assume “that the clocks used to time the eclipses of Jupiter’s satellite continue to 
tell the same time as they travel from one end of the orbit to another” (p. 264). But the truth is that 
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in order to obtain the required data Roemer had to time only a series of 40 consecutive eclipses of 
the Jovial moon—lasting, all in all, some 10 weeks—on two different occasions when the distance 
between Jupiter and the earth was, respectively, increasing and decreasing. The clock he used for 
this purpose might, for aught we know, have stopped and had its annual cleaning between the two 
series of observations. (Indeed, with better instruments it would suffice to measure on each 
occasion the period of approximately 42.5 hours from one eclipse to the next.) Standard 
synchronism enters into Roemer’s calculation through the value assigned to the velocity u with 
which Jupiter recedes from the earth. 


Section 7.2 


1, “Freie Erfindungen des menschlichen Geistes”—Einstein (1934), p. 180. Einstein proceeds, 
on p. 182: “The scientists of [the 18th and 19th centuries] were usually possessed with the idea that 
the fundamental concepts and the fundamental laws of physics were not, in a logical sense, free 
inventions of the human spirit, but could be derived from experiments by ‘abstraction’—e. in a 
logical way. A clear recognition of the erroneousness of this notion came only with the general 
theory of relativity, which showed that one could account for the relevant set of empirical facts in a 
more satisfactory and complete manner on a foundation (Fundament) quite different from 
Newton's. But quite apart from the question of superiority, the fictive character of the foundations 
(Grundlagen) is perfectly evident from the fact that two essentially different foundations can be 
produced that largely agree with experience. This proves, at any rate, that every attempt at a logical 
derivation (einer logischen Ableitung) of the fundamental concepts and the fundamental laws of 
mechanics from elementary experiences is doomed to failure.” (Unless Einstein was fighting straw 
men, “‘logische Ableitung” in the last sentence can only mean “induction”, not “deduction’’, as 
the English translator S. Bargmann rendered it.) The preceding passage occurs in the same lecture 
of June 10th, 1933, from which the Einstein motto of the present book is taken. The latter was 
Einstein’s answer to the question: “If, then, it is true that the axiomatic foundation of theoretical 
physics is not inferred (erschlossen) from experience, but must be freely invented, may we hope to 
find the right way at all?” (ibid., p. 182). 

2. Kant (1787), p. xii. 

3. Kant (1781), p. 231; (1787), p. 283. 

4. Kant (1787), pp. 145 f.; (1783), p. 111 (Ak., IV, 318). 

5. Kant (1790), p. 124 (Ak., VIII, 249). 

6. The five items listed above correspond or are entailed by the Kantian principles of the 
Analogies of Experience (1, 2, and 3), the Anticipations of Perception (4) and the Axioms of 
Intuition (5). (1) and (3) were upset by Special Relativity, (5) by General Relativity, (4) by the Old 
Quantum Theory, and (2) by Quantum Mechanics. 

7, Cf. Kant (1783), p. 101 (Ak., IV, 312). 

8. On the origins and evolution of conventionalism, see J. A. Coffa (1980). On Reichenbach's 
philosophy of geometry, see Kamlah (1977), Beauregard (1977), Nerlich (1976), B. Ellis (1963). We 
shall concentrate on Reichenbach’s version of geometric conventionalism, because it is generally 
believed that it flows from a careful study of Relativity. A more complete treatment of the subject 
would have to consider also the early works of Carnap (1922, 1924, 1925), and, of course, the 
conventionalist philosophy of geometry formulated, while Einstein was still a child, by Henri 
Poincaré (1887, 1891). The latter provides the backdrop for Einstein’s discussion of convention- 
alism in the well-known lecture, “Geometry and Experience” (1921a). Ido not wish to repeat what 
I have said about Poincaré in my (1978). However, the following remarks are new, and will 
illustrate and clarify the main point I have been trying to make above. Poincaré’s conventionalism 
is rooted in Felix Klein’s work “On the so-called non-Euclidean geometry” (1871, 1873), in which 
the classical non-Euclidean geometry of Bolyai and Lobachevsky—called by Klein “hyperbolic”— 
Euclidean or “parabolic” geometry, and a new type of non-Euclidean geometry, discovered by 
Klein and which he called “elliptic”, were united in a single one-parameter family of “projective 
metrics”, i.e. metric structures for projective space. Let P stand for complex projective 3-space. We 
denote by A the set of all real points of P; by B, the set A minus an arbitrary plane (the “plane at 
infinity”); by C, a connected subset of B bounded by a real quadric Q. As is well known, the 
automorphisms (projective permutations) of P map quadrics onto quadrics. The automorphisms 
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of P that map C and its boundary Q respectively onto themselves constitute a subgroup G of the 
full group of automorphisms Gry. Klein showed, following a suggestion of Cayley (1859), that the 
G-invariant properties and relations of the points of C satisfy, under a suitable interpretation, the 
propositions of hyperbolic geometry. Likewise, if Q’ denotes a purely imaginary quadric, the set of 
automorphisms of P that map Q’ onto itself is a subgroup G’ of Gry, whose elements also map A 
onto itself. The G’-invariant properties and relations of the points of A satisfy elliptic geometry. 
Finally, if Q” denotes a certain specific degenerate imaginary quadric, the set of automorphisms 
of P that map Q” onto itself is again a sub-group G” of Gyyy,, whose elements also map B onto itself. 
The G"-invariant properties and relations of the points of B satisfy parabolic, ie. Euclidean 
geometry. Suppose that, as was normally assumed in Poincaré’s time, all physical experiments and 
observations are contained in a finite open subset S of a Euclidean space. S can then be embedded 
in P by an embedding function i that maps straights into straights and satisfies the relation i(S) < C 
< Bc Ac P. Consequently, we can alternatively regard S as a part of a model of hyperbolic, 
Euclidean or elliptic space. Since, as Poincaré (1887), p. 216, duly noted, “the existence of a group is 
not incompatible with that of another group”, all those alternative views of physical space can 
coexist in peace and none is truer than the other. In my opinion, the foregoing argument 
conclusively proves that the choice of one of the said geometries as physical geometry is a purely 
conventional matter, provided that physical space is identified with S. This identification 
presupposes that (i) the locus of bodies has the topological and affine properties taken for granted 
in Newtonian kinematics, and (ii) its global structure, beyond the range of actual experiment and 
observation, is physically irrelevant. Prior commitment to (i) and (ii) sets the stage for the 
conventional choice of a physical geometry in accordance with the Cayley-Klein theory of 
projective metrics. But the prerequisites of such a choice are no longer there if, as in General 
Relativity, other foundational commitments displace (i) and (ii). Poincaré claimed later than even 
the topology of physical space is conventionally fixed. Since he never successfully identified the 
unstructured carrier of alternative physical topologies nor stated the terms of reference for 
choosing between them, I conclude that Poincaré’s topological conventionalism—in contrast to 
his metrical conventionalism—involved a confusion between “worldmaking” commitments and 
intramundane choices. 
9. M. Schlick (1918), p. 352. 

10. Reichenbach (1920), p. 48. 

11. Reichenbach (1920), p. 38. Reichenbach adds, ibid., pp. 38 f.: “It is useless to try to consider 
the individual perceptions as the defined elements of reality. The content of each perception is far 
too complex to serve as an element of coordination. [. . .] Before we coordinate a perception we 
must introduce order in it, by ‘distinguishing what is relevant from what is irrelevant’, But such 
distinction already involves a coordination presupposing certain equations or the laws expressed 
by them.” 

12. Reichenbach (1920), p. 40. 

13. Ibid., p. 50. Kant’s words are from Kant (1781), p. 93/(1787), p. 126. 

14. Reichenbach (1920), p. 40. 

15. Ibid., p. 41. 

16. [bid., p. 85. 

17. Ibid., p. 84. The change of constitutive principles in the course of history gives rise to an 
interesting philosophical problem that is boldly tackled by Reichenbach. How can experience 
refute the constitutive principles on which it rests? Moreover, how can experience built on discarded 
principles be of any use in the search of new constitutive principles? The first question is fairly easy to 
answer. Rejection of constitutive principles does not ensue from a conflict between them and the 
experience built in accordance with them, but follows upon the failure of such experience to 
materialize. If the description of perceived realities in terms of the chosen principles of 
coordination turns out to be inconsistent, we ought not to say that the principles clash with 
experience, but rather that experience itself is falling apart. Such situations can often be redressed 
by redescribing part of the available sense evidence as hallucinatory (or as a forgery). If this way 
out is closed, the principles of coordination must be revised, not because they are contradicted by 
experience, but because they have proved unable to continually constitute experience. It is much 
harder to find a satisfactory reply to the second question. Its inherent difficulty has led some 
contemporary philosophers to view scientific revolutions and conceptual change as essentially 
irrational processes, which are set in motion by the shortcomings of earlier science but cannot be 
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guided by former experience. Reichenbach’s solution is based, he says, on the method used by 
Einstein for revising the conceptual system of Newtonian science after its experimental 
breakdown. If the coordination between scientific concepts and perceived realities could ever be 
exact it would certainly be absurd to test new principles in the light of experiences interpreted 
according to the old principles which the new ones are expected to replace. But the coordination of 
concepts and reality is always merely approximate. As technical improvements increase attainable 
precision or widen the scope of observation, familiar principles that have long provided a 
satisfactory framework for the description of nature begin generating inconsistencies. On the 
other hand, the same principles that are shown to be incompatible with one another when a high 
degree of approximation is demanded or when the whole range of values of the relevant physical 
variables is taken into account, can still yield a fairly consistent description of phenomena if a 
greater margin of imprecision is allowed or a narrower range of such values is admitted for 
consideration. Progress in the formulation of ever more suitable principles of coordination can 
therefore be achieved by successive approximations. Reichenbach’s solution of the problem of 
conceptual change is satisfactory as far as it goes, but one may wonder whether it can encompass 
truly radical departures from established scientific tradition. In what sense can we say that 
Aristotelian physics approximates Newtonian dynamics? Or that Keplerian ellipses converge to 
Ptolemaic epicycles in the limit of naked eye astronomy? Indeed, if scientific change could occur 
only by successive approximations, science would be bound forever to the arbitrary—perhaps 
genetically conditioned—choice of an initial set of constitutive principles and no genuine scientific 
revolutions could take place. 

18. Reichenbach (1924), p. 5. 

19. “Theorie des taglichen Lebens”—Reichenbach (1924), p. 6. 

20. Strictly speaking, Definition 9 cannot even be said to coordinate the class of light rays with 
the concept of a straight line, for it says only that the former is contained in the extension of the 
latter, not that it is equal to it. It is therefore hard to understand why the statement is termed a 
“definition”. But then, logical neatness is not the main strength of Reichenbach’s axiomatics. 

21. Reichenbach (1928), p. 23. 

22, Reichenbach (1928), p. 23. 

23. But is it really a coordinative definition? The reference to temperature should make us wary. 
“At O°C” can indeed be explained in terms of melting ice at sea level, but reference must then be 
made to the purity of the water and the normality of the atmospheric pressure. A good deal of 
theory went also into the design of the Paris rod. The current SI metre, equal to 1,650,763.73 times 
the wavelength in vacuo of the radiation corresponding to the transition between the 2p, and the 
5d, levels of an atom of krypton 86, is of course firmly anchored to contemporary theoretical 
physics. 

24, “Unter Kraft verstehen wir ein Etwas, das wir fiir eine geometrische Verdnderung 
verantwortlich machen”—Reichenbach (1928), p. 38. 

25. Reichenbach (1928), p. 21. Universal forces were introduced as “forces d’espéce X”’ in 
Reichenbach (1922), pp. 34 ff. See also Reichenbach (1924), p. 68, where he introduces the terms 
“metrical force” (for forces of type X) and “physical force” (for all other forces). 

26. Reichenbach (1928), p. 32. Note that this definition does not simply coordinate the concept 
“rigid body” with “this thing here”. We never come across a solid which is not subject to 
differential forces or from which the influence of such forces has been already—i.e. before we 
approach it with our theory-laden views—“corrected away”. 

27. Reichenbach (1928), p. 49. 

28. Reichenbach (1928), p. 293: “In der Gravitation [liegt] der Typus einer universellen Kraft 
[vor].” Cf. Reichenbach (1922), p. 38: “La gravitation, qui jusqu’ici avait été concue comme toutes 
les autres forces, est précisément, si l'on y regarde de plus prés, une force d’espéce X.” Reichenbach 
(1924), p. 68: “Zu den metrischen Kraften gehort auch die Gravitation; darin besteht der Sinn der 
Einsteinschen Gravitationstheorie.” 

29. Reichenbach (1928), pp. 293 f. 

30. Einstein (1956), p. 79. i 

31. Reichenbach must have had some inkling that General Relativity does not simply do away 
with the force of gravity, for right after the passage quoted on p. 236 (ref. 29) he proclaims 
nevertheless that “we must ascribe to the gravitational field the physical reality of a force field”, 
namely, “the force field which we regard not as the cause of the perturbation of geometrical 
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relations, but as the cause of geometry as such”. For “even if we do not introduce a force to explain 
a deviation of a measuring instrument from some standard geometry, we do still invoke a forceasa 
cause for the fact that there is a uniform concordance of all measuring instruments” (Reichenbach 
(1928), p. 294). I dare say that in this text Reichenbach’s use of the word “force” has lost the last 
remnant of precision. 

32. As far as I can tell, the idea is due to Poincaré (1895), who bid us imagine a world enclosed in 
a great sphere of radius R, in which all solids have the same coefficient of expansion, thermal 
equilibrium is instantaneously reached, and the absolute temperature at distance r from the centre 
is proportional to R? —r?. 

33. Russell (1959), p. 39. Russell referred specifically to his own earlier Essay on the Foundations 
of Geometry (1897), in which he had asserted that the free mobility of rigid bodies was a condition 
of the possibility of physical experience. The pervasive influence of Helmholtz’s view was indeed so 
great that 13 years after the discovery of the field equations, Einstein himself characterized 
geometry as “‘la science des possibilités de deplacement des corps solides”, and added that in 
General Relativity the metric tensor defines “le mouvement des solides librement déplacables, en 
Vabsence d’effets électromagnétiques” (Einstein (1928c), p. 164; I thank Adolf Griinbaum for 
calling my attention to this passage). 

34, Hugo Dingler had repeatedly pointed out that the manufacturers of precision instruments 
assume the validity of Euclidean geometry and control their products by its rules. Likewise, 
surveyors correct their observations by Euclid’s theorem about the sum of the three interior angles 
ofa triangle. According to Dingler, preference for Euclidean geometry was mandated by its unique 
position among the proper Riemannian geometries of constant curvature (the others involve an 
arbitrary parameter). On Dingler’s philosophy of geometry, see Torretti (1978b) and the references 
listed there. 

35. Einstein to Barnett, June 19th, 1948; quoted by J. Stachel (1977), p. ix. Einstein (1928), 
pp. 164 f. had already made the same point. 

36. Onempiricism, apriorism and conventionalism in 19th century philosophy of geometry see, 
for instance, Torretti (1978a), Chapter 4. Conventionalism had at the time a clear advantage over 
the other two. Apriorism, when not shielded by ignorance, had to make such concessions to the 
plurality of geometries, that the “pure form of externality” became increasingly vague. Empiricists, 
on the other hand, were unable to explain the leap from blurred data to neat mathematical 
structures, or to justify the collection of empirical data on geometry by means of geometrically 
characterized instruments and processes. 

37. Reichenbach (1928), p. 42. 

38. Cf. F.A. E. Pirani (1956). Pirani’s method for measuring the curvature components rests on 
a theorem in differential geometry known as Jacobi's equation or the equation of geodesic deviation. 
Take any n-manifold M(n 2 2), with a symmetric linear connection that determines the covariant 
differential operator V and the curvature R. Consider a family of geodesics in M, indexed by a 
parameter v, which forms a congruence in a submanifold M’ < M, diffeomorphic with R?. We 
assume that the indexing by the parameter v is so contrived that there is a chart defined on M’, with 
coordinate functions wu and v, such that each geodesic of the family is a parametric line of u (with u 
varying along it as an affine parameter), while v assigns to each point of M’ the index of the geodesic 
through that point. Set U = ¢/¢u, V = 0/¢v. Since the connection is symmetric (i.e. torsion free), 
VyV-VyU = [U, V], by (C.6.6.). But [U, V] = 0 (p. 262). Consequently, by (C.6.7), 


Vu Vu = VyVyU = VyVyU + RU, V)U 
By the definition of geodesics, (C.7.1), VyU = 0. We thus have that 
VyVyV = R(U,V)U (x) 


which is Jacobi’s equation. In terms of the components of U, V and R relative to an arbitrary chart 
of M, Jacobi’s equation reads: 


@Visau? = Ri,,UAUEV" (#x) 


If we are dealing with timelike geodesics in a relativistic spacetime we can choose proper time as 
the affine parameter reflected by u. Let two particles A and B describe two nearby geodesics of the 
family, and refer equation (**) to a local Lorentz chart x based at A. Then, x° = u, 
V° = éx°/év = Gu/dv = 0, and, if B is very near A, the space components V* = @x*/év of V along 
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A’s worldline are proportional to the space coordinates of B in the local inertial frame. Hence, the 
quantities ¢?V*/Cu? measure the acceleration of B with respect to 4. The following, mutually 
complementary explanations of geodesic deviation and its physical significance can be useful as an 
introduction to Pirani’s paper: Clarke (1979), pp. 74-76; Dodson and Poston (1977), pp. 453 f.; 
Weinberg (1972), pp. 453 f.; Misner et al. (1973), pp. 265-75 (nicely illustrated); Synge (1960), pp. 
19-25, 57-64. 

39. The normal scientific practice of testing a hypothesis by methods that partly presuppose it is 
studied at length, under the suggestive name of “bootstrapping”, by Clark Glymour (1980). 

40. The main primary sources for the study of Griinbaum’s philosophy of geometry are 
collected in Griinbaum (1968), (1973). Grinbaum (1950, 1952, 1953) give us some insight into the 
philosopher's original motivation. The important essay, Griinbaum (1977), contains much that is 
completely new. The reader may also wish to consult Grinbaum (1967), a book on Zeno of Elea. 
Among the numerous discussions of the subject, I find the following to be particularly 
illuminating: Putnam (1963), Earman (1970), Glymour (1972a), Stein (1977b). See also Massey 
(1969), Van Fraassen (19695), Fine (1971), Friedman (1972), Horwich (1975), Quinn (1976), Nerlich 
(1976), pp. 155-85, Angel (1980), pp. 236-44. 

41. Griinbaum (1970), pp. 488-94; (1973), pp. 468-74. What distinguishes R-metrics from other 
cross-sections of the bundle of symmetric (0, 2)-tensors over a manifold S is that their range cannot 
include “degenerate” tensors. (A symmetric (0, 2)-tensor tata point Pe Sis said to be “degenerate” 
if there isa non-zero vector vu € Sp such that ¢(v, w) = 0 for every we Sp. Note that according to this 
definition a “non-degenerate” symmetric (0, 2)-tensor at Pe S is an inner product on Sp; see p. 303, 
note |. 

42. Griinbaum (1962), p. 413; (1963), p. 10; (1968), p. 13; (1973), p. 10. The following passage 
precedes the above, and may contribute to clarify it: “There is no intrinsic attribute of the space 
between the end points of a line-segment AB, or any relation between these two points themselves, 
in virtue of which the interval 4B could be said to contain the same amount of space as the space 
between the termini of another interval CD not coinciding with AB. Corresponding remarks apply 
to the time continuum.” See also note 44, 

43. Griinbaum (1970), p. 518; (1973), p. 498. 

44. Griinbaum has often linked his reasons for denying that space and time, in and by 
themselves, are metric structures, to Riemann’s distinction between discrete and continuous 
manifolds. Riemann said that the quantitative comparison between different parts of a manifold 
(distinguished from the rest of it by a characteristic or a boundary) “is made by counting in the case 
of discrete quantities, and in the case of continuous quantities by measurement”. He further noted 
that “measurement consists in a superposition of the quantities to be compared, and therefore 
requires some means of transporting (forttragen) one quantity as a measure of the other” 
(Riemann (1867), p. 9). According to Riemann, this implies that “while in a discrete manifold the 
principle of metric relations is already contained in the concept of the manifold, in a continuous 
one it must come from somewhere else” (ibid., p. 23). Commenting on these ideas, Grinbaum 
(1968), pp. 216 f., bids us consider “an interval 4 B in the mathematically continuous physical space 
of, say, a given blackboard, and also an interval 7, 7, in the continuum of instants constituted by, 
say, the movement of a classical particle’. Contrary to what would be the case in a discrete 
manifold, “neither the cardinality of AB nor any other property built into the interval provides a 
measure of its particular spatial extension, and similarly for the temporal extension of 7) 7,. For 
AB has the same cardinality as any of its proper subintervals and also as any other nondegenerate 
interval interval CD. Corresponding remarks apply to the time interval 77). If the intervals of 
physical space or time did possess a built-in measure or ‘intrinsic metric’, then relations of 
congruence (and also of incongruence) would obtain among disjoint space intervals AB and CD on 
the strength of that intrinsic metric. And in that hypothetical case, neither the existence of 
congruence relations among disjoint intervals nor their epistemic ascertainment would logically 
involve the iterative application and transport of any length standard. But the intervals of 
mathematically continuous physical space and time are devoid of a built-in metric. And in the 
absence of such an intrinsic metric, the basis for the extensional measure of an interval of physical 
space or time must be furnished by a relation of the interval to a body or process which is applied to 
it from outside itself and is thereby ‘extrinsic’ to the interval. Hence the very existence and not 
merely the epistemic ascertainment of relations of congruence (and of incongruence) among 
disjoint space intervals AB and CD of continuous physical space will depend on the respective 
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relations sustained by such intervals to an extrinsic metric standard which is applied to them. Thus, 
whether two disjoint intervals are congruent at ail or not will depend on the particular coincidence 
behavior of the extrinsic metric standard under transport and not only on the particular intervals AB 
and CD. Similarly for the case of disjoint time intervals 7, 7, and T, 73, and the role of clocks.” 
These considerations, which, if true, would evidently apply also mutatis mutandis, to the 
continuous spacetime manifold of Relativity, are sufficient, according to Griinbaum, to establish 
the conventionality of geometry on solid grounds, for “an extrinsic metric standard is self- 
congruent under transport as a matter of convention and not as a matter of spatial fact, although 
any concordance between its congruence findings and those of another such standard is indeed a 
matter of fact” (Griinbaum (1968), p. 217). 

45. Griinbaum (1968), p. 209. 

46. Griinbaum (1968), p. 209. Where I insert “[GGC]” Griinbaum writes “Riemann’s 
doctrine”. Cf. note 44. 

47, Let M be an n-manifold. By a foliation F of M, with codimension g (1 < q <n), I meana 
family of path-connected subsets of M called leaves, such that (i) each point Pe M lies in one and 
only one leaf of F,and (11) ifa given L € F contains a particular P € M, there isa chart x of M defined 
on a neighbourhood U of P whose last q coordinate functions are constant on L 4 U. It follows 
that each leaf of F is an (n — q)-submanifold of M. A foliation of spacetime into hypersurfaces is 
therefore a foliation with codimension 1; a congruence of curves in spacetime is a foliation of 
spacetime with codimension 3. 

48. Note that Robertson—Walker spacetimes are not usually handled in this way. Instead of a 
single cosmic space of worldlines (the parametric lines of t) with line element ds, = do, we 
consider each fibre of 1 as a distinct space (of events), with its own time-dependent line element 
ds, = S(t)do. The choice between these two approaches is indeed conventional for, as Eddington 
(1933), p. 90, remarked, “the theory of the ‘expanding universe’ might also be called the theory of 
the ‘shrinking atom’.” But no matter which approach we prefer, the space metrics ds, are not 
arbitrarily postulated, but result from the spacetime metric. 

49. Nerlich (1976), p. 184; cf. ibid., p. 129. In the “Reply to Hilary Putnam” from which the last 
two indented quotations (refs. 45 and 46) are taken, Griinbaum considers what we may call “local” 
or “restricted” space metrics, dependent on a particular spacetime chart x, defined on a region U. 

_A “space” is carved out of U by identifying all worldpoints whose space coordinates by x!, x? and 
x? are identical. If the spacetime metric is expressed, relative to x, by (ds*)? = g,;dx‘dx’, it is 
arguably appropriate to set (ds3)? = y,gdx7dx*, where yop = Gap ~ (Joa9os/Goo): (See, for instance, 
Moller (1952), pp. 237 ff.; Arzeliés (1961), pp. 70 ff.; Landau and Lifschitz (1962), pp. 272 ff.) The 
physical significance of such a ¢hart-dependent line element is of course very doubtful and it is 
altogether misleading to describe y,, as “the spatial metric tensor”. However, what really matters 
for our evaluation of GGC is that ds, is uniquely determined by ds, for any given x, i.e. for any 
given space carved out of space-time in this way. = 

$0. Griinbaum (1977), p. 342. Here and in the following quotations I substitute Latin letters for 
indices ranging over {0, 1, 2, 3}. 

51. Griinbaum (1977), pp. 342, 343. 

52. Griinbaum (1977), pp. 344 f. Like Stein (1977), p. 380, 1am unable to understand how an 
open-ended set can be canonical. 

53. Griinbaum (1977), p. 345. 

54. Griinbaum (1977), p. 346. 

55. Griinbaum (1977), p. 356. 

56. Griinbaum (1977), p. 351, notes that his thesis “that the g; ; fields of NE-GQ worlds acquire 
geometrical significance through the mediation of external entities like the FCS” enables us to 
“accomodate philosophically” some physical theories that endow spacetime with two metrics, 
realized by two different FCS. Thus, in a modification of General Relativity proposed by P. A. M. 
Dirac with a view to reconciling it with Quantum Mechanics, “macroscopic and atomic metric 
standards generate incompatible metric ratios among space-time intervals”, so that “only the 
macroscopic standards physically realize the scalar |g; ;dx'dx’|''? corresponding to the g;; field of 
Einstein’s equations”. But if a theory of this kind does actually postulate two distinct spacetime 
metrics, which are not to be regarded merely as a coarser and a finer approximation to one and the 
same field, but as two different fields, defined on every spacetime region, no matter how small, I do 
not see how one would be philosophically better served by distinguishing between the two 
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“physical” fields, which are discernible in principle at each worldpoint, and their respective 
“geometrical” counterparts, which exist only where they are “constituted” as such by the 
appropriate FCS. Since both “physical” fields have by hypothesis the “formal mathematical 
properties” of a metric, no philosophical problem raised by their actual coexistence at every 
worldpoint can be solved by the expedient of denying them the epithet “geometrical”. 


Section 7.3 


1. “Si quid intelligitur temporis, quod in nullas iam vel minutissimas momentorum partes 
dividi possit, id solum est quod praesens dicatur; quod tamen ita raptim a futuro in praeteritum 
transvolat, ut nulla morula extenditur. Nam si extenditur, dividitur in praeteritum et futurum: 
praesens nullum habet spatium” (St. Augustine, Confessiones, X1.15.20; cf. XI.21.27). 

2. Tread in the Cambridge Encyclopedia of Astronomy (1977), p. 24, that Rigel (8 Orionis) lies at 
a distance of 250 parsec~ 815 lightyears. Frederick’s fourth expedition to Italy began in 1166. 

3. Denote the said Lorentz chart by yand suppose that its frame moves past us, in the direction 
opposite to Rigel, with a speed v = 0.99999 (c = 1). If Rigel’s space coordinates in our own rest 
frame are x! = —815, x? = 0, x? = 0, and the time at which the light we see was emitted is 
x° = —815, we readily calculate, using (3.4.18), that y° = — 1.82. Referred to y, the light we see left 
the star less than two years ago (and Rigel itself is of course only 1.82 lightyears away). 

4. It was held indeed by Aristotle himself, a century before the Stoics. See De Interpretatione, 9. 
By the way, I do not believe that the ancient Stoics were truly guilty of logical determinism, but 
rather that they held, against Aristotle, that every proposition is either true or false, because they 
subscribed to strict determinism on physical grounds. 

5. Writing Lp for “it is necessary that p", we have that L(pv ~ p) does not logically imply that 
Lpv ~ Lp. 

6. Let A’ and B’ be the images of A and B by a Lorentz chart x whose time axis goes through A 
and B. Draw 4 through 4’ perpendicular to A’B’. All points on A are the images by x of events 
simultaneous with A in the frame of x. Choose any point C’ on A such that C’ B’ is the image by x of 
a spacelike geodesic. C’ is the image by x of an event C. Draw yu through 4’so that py is not 
perpendicular to C’ B’ and makes with / an angle complementary to 4 A’C’ B’. wis the image by x of 
the time axis of a Lorentz chart y in whose frame C is simultaneous with B. 

7. Putnam (1979), p. 198. A different sort of ambiguity turns up on p. 200. Putnam discusses 
the following allegedly binary relation: Rxy ifand only if x is simultaneous with y in the coordinate 
system of x. However, the definiens is not binary but ternary, for the first occurrence of x ranges 
over events, whereas the second must range over some other domain. (To speak of “the coordinate 
system of a given event” is a category mistake.) 

8. Rietdijk (1966), p. 342. 

9. This notion of smooth agreement can be made precise as follows. We say that Q is a 
neighbouring point of Pe # ‘if Q lies on the neighbourhood of P on which Expp™ ' (the inverse of 
the local exponential mapping) is a diffeomorphism. There is then one and (up to reparametriz- 
ation) only one geodesic 7 from P to Q which never leaves that neighbourhood. The local time 
orientations at P and Q agree smoothly if any future-directed vector at P is sent by parallel 
transport along 7 to future-directed vectors at each point in y’s range up to and including Q. 

10. It is clear that a time-orientable spacetime cannot contain a closed curve that is not time 
preserving. On the other hand, if our spacetime (W,, g) is not time-orientable there is, for each 
choice of a local time orientation at every point of W; a pair of neighbouring points P and Q at 
which the time orientations do not agree smoothly in the sense of note 9. But then, for every 
arbitrary fixed O€ W, a loop OPQO which is geodesic from P to Q cannot be time preserving. 

11. Remember that two curves y, 7’, which map the same interval J < R into the same manifold 
Ware said to be homotopic if one of them is continually deformable into the other, Le. if there is a 
continuous mapping f of I x [0, 1] into #, such that, for each te J, f(t, 0) = y(t) and f(t, 1) = y"(w). 

12. Let S be a path-connected topological space. A connected topological space Sis said to bea 
covering space over S with covering map f, if there is a continuous mapping f of S onto S, such that 
every PeS has a connected open neighbourhood U onto which each connected component of 
{~'(U) is mapped homeomorphically by f. S is a universal covering space over S if it is simply 
connected. Thus, for example, the plane R? is a universal covering space over the cylinder R x S, 
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with covering map (x, y)++ (x, (cos 2zy, sin 2zy)); note that each fibre of this covering map has 
infinitely many points. If S is a differentiable manifold, every covering space over S has a unique 
differentiable structure such that the covering map is smooth. A connected manifold M has a 
unique (up to isomorphism) universal covering manifold M. M is a principal fibre bundle over M 
whose structural group is the fundamental group of M. See Steenrod (1951), pp. 67 ff. 

13. The preceding discussion of time-orientability closely follows the excellent article by 
Geroch and Horowitz (1979), which also contains a careful yet not too technical discussion of the 
causal structure of relativistic spacetimes that I heartily recommend to the reader. The notion of 
topologically distinct yet observationally indistinguishable spacetimes is the subject of a deep 
mathematico-philosophical inquiry by C. Glymour (19725, 1977). D. Malament (19775) provides 
a luminous and insightful introduction to Glymour’s work. 

14. We referred to the Gédel solutions on p. 331, note 36; H. Stein (1970a) discusses them with 
special regard to their peculiar causal structure. On the Kerr solutions, see the references 
mentioned on p. 335, note 2, and (for a philosophical discussion) B. Kanitscheider (1979a), pp. 
168-72. The extraordinary properties of the Taub-NUT solution are discussed by Misner (1967). 
The recent treatise by D. Kramer et al. (1980) systematically studies and classifies the exact 
solutions to the Einstein field equations known to date, including those just mentioned. 

15. S. Hawking (1971), p. 396. 

16. S. Hawking and G. F. R. Ellis (1973), p. 189. 

17. S. Hawking and G. F. R. Ellis (1973), p. 197. 

18. R. Geroch and G. T. Horowitz (1979), p. 238. 

19. Note that the common practice of referring a particle horizon to a particle at a worldpoint, 
and not just to a worldpoint, is redundant and may be misleading. 

20. Roger Penrose has given a different definition of particle horizons, which is not tied to the 
existence of a congruence of timelike curves. A Penrose particle horizon is, like an event horizon, 
the boundary ofa set of worldpoints defined with respect to a particle or its worldline. The Penrose 
particle horizon of a timelike worldline y is the boundary of y’s chronological future J * (7). Note 
that the event horizon of y is in effect the boundary of y’s chronological past I~ (y) (because y’s 
causal and chronological past have the same closure and hence the same boundary). The event 
horizon of } “separates those events which are observable by an observer whose worldline is y from 
those events unobservable by him”, while the Penrose particle horizon of y “separates those events 
from which a particle with worldline y can be observed (in principle), from those events from which 
the particle cannot be so observed” (Penrose (1968), p. 189). A short reflection should make clear 
that a non-empty Penrose particle horizon will occur in every case in which the Rindler particle 
horizon is non-empty. (For the latter is a boundary of the particles which must remain concealed 
to a given observation, whereas the former is the boundary of the potential observations to whicha 
given particle must remain concealed.) However , in spite of the alluring simplicity and greater 
generality of Penrose’s concept, Rindler’s has prevailed in the literature. 

21. See, for instance, Hawking (1972); Hawking and Ellis (1973), Chapter 9. A particle outside a 
black hole cannot be reached by causal curves originating inside it. Thus, all events within the black 
hole lie beyond the horizon of every particle that does not fall into it. These statements do not hold 
if the energy contents of a black hole can “tunnel through” its boundary according to the laws of 
Quantum Mechanics, as in Hawking (1975)}—cf. the popular exposition by Hawking (1977). 

22, MacCallum (1971). 

23. Cf. Misner (1969b). 

24, Misner (1980), p. 225. Misner points out that A. H. Taub’s work on anisotropic Big Bang 
worlds shows that the “horizon paradox” we have described above is not merely a consequence of 
the ideally perfect isotropy of the Friedmann worlds, which could be avoided in a more realistic 
cosmological model. 


Appendix 


A. Differentiable Manifolds 


1. Gauss (1827), Riemann (1867). I have discussed Gauss’s methods and Riemann’s theory in 
Torretti (1978a), pp. 67-109. 
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2. A mapping f of a topological space onto another is a homeomorphism if (i) f is bijective 
(ie. one-one and onto), and (ii) both f and its inverse f~! map open sets onto open sets. A 
homeomorphism is thus an isomorphism of topological spaces. Two such spaces are said to be 
homeomorphic if there is a homeomorphism of one onto the other. 

3. The adjective real refers to the fact that the charts have their ranges in R", and so have real- 
valued coordinate functions. A complex manifold has complex-valued coordinate functions. More 
generally, one may replace R" in the above definitions by an arbitrary n-dimensional Banach space 
(ie. a normed vector space in which all Cauchy sequences converge). It is worth noting that a C*- 
differentiable structure can be defined on an arbitrary set S as follows: An n-chart is a bijective 
mapping ofa part of S onto an open subset of an n-dimensional Banach space; a C*-atlas A of Sand 
the respective maximal atlas Ama, are defined as above, with ‘n-chart’ substituted for ‘chart’; S is 
given the weakest topology that makes every n-chart in Aja, into a homeomorphism. 

4. The exact order of differentiability of a mapping is usually of no consequence in physics, 
because any C’-differentiable mapping of a C*-differentiable manifold into another can be 
approximated to any desired degree by a C'-differentiable mapping, if 1 <r < k < o. (Munkres 
(1966), p.41.) There is no significant loss of generality in confining our discussions to C *- 
differentiable manifolds, because any maximal C'-atlas of a manifold S includes a C *-subatlas. 
(Munkres (1966, p. 46.) On the other hand, the postulated order of differentiability of some 
unphysical auxiliary functions—e.g. a metric at infinity—can have significant physical impli- 
cations; see Ehlers (1980a), p. 284. And of course the difference between C'!-differentiability and 
mere continuity can be dramatic; this matters because even within the scope of a straight 
differential-geometric theory there are physical situations which one might wish to deal with by 
means of unsmooth continuous or merely bounded functions, and it is a moot question whether 
one can actually do so without dismantling the theory. See Taub (1979); Tipler, Clarke and Ellis 
(1980), pp. 156 f. 

5. That is: 


Hila) = lim (1/h) (fox (yy Mp EAS Mg) Kf gy os Ms oy Uy) 
h70 


where u = (u,,...,u,) is any point in the range of x. 

6. Proof: Let v = £,a,¢/¢x'. Then vx/ = a,. Hence, v is identically 0 only if a; = 0 for all 
indices j. 

7. This definition of vector field is equivalent to the one given on p. 260. Note that if V is a 
vector field on S, f-+V/ is a linear mapping of F (S) into itself, which is also called V in the 
hterature. The homonymy is justified insofar as the mapping of the linear space of linear 
endomorphisms of ¥ (S) into the module of vector fields on S that sends each such V to its 
namesake is a linear isomorphism. 

8. See, for instance, MacLane and Birkhoff, Algebra, p. 238. 

9. See, for instance, Y. Choquet-Bruhat et al. (AMP), pp. 117-21. 


B. Fibre Bundles 


1. Lie groups were defined on p. 288, note 12. The action “on the right” of a Lie group on a 
manifold was defined on p. 288, note 13. 

2. In other words, if a,be E, there is a géG such that ag = b if and only if x(a) = x(b). 

3. For a proof, see Kobayashi and Nomizu, FDG, vol. I, p. 52. 

4. Denote this mapping by f/f. If g, heG, aeE, seF, f(gh,(a,s)) = (agh,h~'g™'s) 
= fh, f(g, (a, s))). 

5. If A€GL(n, R) is a linear automorphism of R", A can be extended to a unique linear 
automorphism of the tensor algebra on R", which maps t? onto itself (for each value of q and r). 
Since t? (for each q and r) is constructed in a definite way from R, R" and the dual space (R")* = t, 
(the space of real-valued linear functions on R"), A extended is completely determined if we know 
its restrictions to R” and to (R")*. The former is of course A. We set the latter equal to (A7)~', ie. 
the inverse of the transpose of A. 
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C. Linear Connections 


1. Other alternatives are explained and compared by Y.Choquet-Bruhat et al. (1977), 
pp. 287 ff. 

2. Remember that the kernel ker f of a linear mapping f is the subset of its domain which f 
maps on zero. 

3. The p-linear mapping wp of Sp into V is determined by its value at each list of p vectors 
drawn from a basis of Sp. If n < p, all such lists are repetitious. 

4. The Lie derivative of a vector field along another was defined on p. 262. 

5. Kobayashi and Nomizu, FDG, vol. I, pp. 36f., Prop. 3.11. 

6. If i denotes the canonic inclusion of x~'(zq) into E, m-i is a constant mapping and 
(x' i), = 7, i, = 0. Hence i,,(V,) c ker 2,,. Since ker z,, is m-dimensional it must be identical 
with V,. 

7. As lifts are so to speak the key to our subject, let me propose another view of them. Since 
each space of horizontal vectors H, is n-dimensional we can think of it as the image of a linear 
injection o, of B,, into E,. a, is uniquely defined by the requirement that z,, ‘0, be the identity on 
B,,. Evidently, g, varies smoothly with q and is consistent with the right action R (so that for any 
9€G, oq, = Ryyq’ Gq). 7, May be said to lift the tangent space B,, to E,. The lift of the vector field X 
on B is the vector field X* on E given by q ++ o,(X,,). The uniqueness of X* is ensured by its 
definition. 

8. A rigorous proof of this fact will be found in Kobayashi and Nomizu, F DG, vol. I, pp. 69 f., 
Prop. 3.1. We can make it plausible as follows: Let X be a vector field on a neighbourhood of »’s 
range, which assigns to each point of the latter the tangent to y at that point. If X* is the lift of X 
and ¢ denotes the mapping ui+u+¢ of R onto itself, y-¢ is the restriction to ¢(/) of the maximal 
integral curve of X* through p. 

9. Kobayashi and Nomizu, FDG, vol. I, p. 142, Prop. 7.3. 

10. Kobayashi and Nomizu, FDG, vol. 1, p. 143, Prop. 7.4. 

11. See Kobayashi and Nomizu, FDG, vol. I, p. 76,example 5.2, and the remarks on Qand O on 
pp. 77 and 120. A study of the said example 5.2 will disclose the deep necessity underlying our 
seemingly artificial definitions (C.6.1) and (C.6.2). The “easily verified” fact mentioned by 
Kobayashi and Nomizu on p. 76, line 13, rests on the following: each pe x~'(P) < FS is equal to 
qg for some g in the structure group; q~' sends each reR" to the equivalence class 
[q.r]eSp¢ TS, which, if p = qg, is identical with [p,g~'r], the image of g~'r by p. 

12. Kobayashi and Nomizu, FDG, vol. I, p. 135, Prop. 5.3. 

13. Since this result is quite fundamental for General Relativity we may as well prove it. For any 
n-tuple (r',...,r")eR", the equations x'- y(t) = r't define a geodesic y such that y(0) = P. Since 
dx!-y/dt = riand d*x'-y/dt? = 0, (C.7.2)entails that I’, (P)r/r* = 0. As this holds for all n-tuples 
(rt...) Uh (P)+ Fi, (P) = 0. If Py, = Fh, it follows that F4,(P}= —4(P) = 0. 

14. For a proof, see Kobayashi and Nomizu, FDG, vol. I, pp. 79 ff., Prop. 6.1. 

15. Although I have hitherto taken for granted that the reader is acquainted with linear algebra, a 
definition of contraction might not be out of place here. If fis a q-linear function ona linear space V 
and g is an r-linear function on a linear space W, the tensor product { ®)g is the (q +r)-linear 
function defined on V7 xW" by 

(FOI) Ug Wis We) HS (Urs. UG (WW) (ve V, weW). 

It is clear that, for suitable multilinear functions f, g,h, f®(g@A) = (f@g)@h. As we know, a 
(q, r) tensor field on an n-manifold S can be regarded as a (q + r)-linear mapping of the module 
(¥°, (S))¥ x (¥°'(S))’ into its ring of scalars F (S). We shall say that a (q, r)-tensor field on S is 
simple if it is the tensor product of q vector fields and r covector fields on S. Obviously the module 
¥ 4(S) of (q, r)-tensor fields on S is spanned by its simple elements. Hence a linear mapping of 
¥°4(S) need only be defined on the latter. The contraction Ctr} of (q, r)-tensor fields on S with 
respect to their i-th contravariant and j-th covariant indices (1 <i <q,1 <j <r) is the linear 
mapping of ¥ #(S)into ¥ 47 | (S) defined as follows: for any vector fields on S, X,,.. .. X, and any 
covector fields on S, @,,..., @,, 


Ctr} (X,@. -- OX, ©, ©... @Qa,) = (Xj) (KX @..-@OXj-1@OXi41®.-- 
@X,B,®... Ov, O0j.1®..-Oay). 


RAG - L* 


350 RELATIVITY AND GEOMETRY 


It follows immediately from this definition that if A is, say, a (2, 2)-tensor field given, on the domain 
ofachart x, by A = A4,, (@/€x'@ @/ex!@dx* @d-x"), the contraction of A with respect to the first 
contravariant and the first covariant indices is given on the domain of x by Ctrj(A) 
= Ali, (0x*/Ax')(8/Ax/@dx") = A',,(0/0x4@dx"). The (,)-th component of Ctr} (A) relative to x 
is therefore equal to A‘, (summation intended over the repeated index i!). The contraction of 
tensors at a point is defined analogously. 


ADDENDA to page 302, note 34. On the other hand, Elie Zahar (1982), with whose reading of 
Poincaré’s conventionalism I substantially agree, maintains that Poincaré discovered Special 
Relativity independently in 1905 and that his philosophical insights contributed decisively to this 
achievement. A similar opinion is voiced in this series by Jerzy Giedymin (1982). In my review of 
Giedymin’s book (Didlogos, 40, November 1982), I reaffirm and further elaborate the views I put 
forward above, in Section 3.8. 
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INDEX 


A priori knowledge of physical world 232, 
233 
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Aether 37, 38, 39-47, 117, 290 n. 3, 291 n. 22, 
300 n. 9 
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Lorentz 42, 43, 291 n. 21; Michelson 
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327 n. 19 
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n.3 
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Cauchy surface 124, 215, 332 n. 44 

Causal automorphism 127 
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Causal space 123; stably — — 254, 337 n. 23; 


121; of set of 


strong 125, 253; topological 124 
Causal structure of relativistic space- 
time 192, 247f., 252-5 
Causal vector 124 
Causality and spacetime structure 123, 255 


Causality and time order 220 
Causality: formal vs. efficient 
Causally connectible 121 
Causally precedes 121, 192 


255 


Chart 257; Cartesian 14; global 257; i-th 
coordinate function 14, 256; i-th para- 
metric line 259. See also under Fermi 
chart, Galilei chart, Lorentz chart, 
Normal chart, Winnie chart. 

Chronological future 121 

Chronological neighbourhood 

Chronological past 121 
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Clock 12, 13, 52, 96 

Clock hypothesis 96, 327 n. 10 

Clock paradox 68f., 96, 296 

Clock synchronization: by clock transport 
226f., 338 n. 15, n. 16; by light signals 
53f., 223; determined by Reciprocity 
Principle 298 n. 4; Eddington on 226; 
Einstein on 53f.; Reichenbach on 
223ff; standard vs. non-standard 
226; symmetry of 294 n. 7; Thomson 
on 52, 294 n. 5; transitivity of 
294 n. 7 

Comoving coordinates 329 n. 24 

Compact topological space 329 n. 22 

Compass of inertia 200, 331 n. 34 

Conceptual change 108, 341 n. 17 

Conformal metrics 192 

Conformal structure of a manifold 192 

Conformally flat metric 313 n. 30 

Congruence of curves 28 

Congruence path-dependent in Weyl’s “pure 
infinitesimal geometry” 190 

Conjugate vector (in Minkowski — ge- 
ometry) 125 

Connected topological space 288 n. 12, 314 
n. 14 

Connection in a principal fibre bundle 270. 
See also Linear connection. 

Connection form 271 

Contraction of a tensor 349 n. 5 

Conventionalism 87, 231, 232. Geometric: 
Griinbaum 242 ff., 344-6; Poincaré 
283 n.4, 340 n.8; Reichenbach 
234 ff. See also under Simultaneity. 

Coordinate system see Chart. 

Coordinate transformation 15, 257 

—-—, Galilei 24-5; purely kinematic 
(pk) 29 

— —, Lorentz 56; homogeneous 72; math- 
ematical vs. physical sense 72; non- 
kinematic (nk) 57; orthochronous 72; 
proper 72; purely kinematic (pk) 61 f. 

— —, Mobius 76 

Coordinates: metrical meaning in Newtonian 
mechanics and Special Relativity 14, 15, 
51, 53, 54, 56; no immediate metrical 
meaning if gravity is present 147, 149, 
151f., 154 


124 


121, 192 


Coordination of concepts with 
(Reichenbach) 233, 234 

Coordinative definition (Reichenbach) 
342 n. 20, n. 23 

Corresponding States, Lorentz’s Theorem 
of 44f., 46f., 292 n. 26 

Cosmological constant 198 f., 205 

Cosmological Principle (Milne) 333 n. 24 

Cosmological redshift (in Friedmann uni- 


reality 


234, 


verse) 207f., 334 n. 27 
Cosmology, relativistic 197-9, 202-10, 
328-38 


Cotangent bundle 99 

Cotangent space 260 

Cotangent vector see Covector. 

Covariance see General covariance 

Covariant derivative 101, 155, 274-5 

— —, absolute 275 

— —, exterior 270 

Covariant differential 275 

Covariant differentiation commutative only in 
flat manifold 317 n. 15 

Covector 99, 260 

Covering space 346 n. 12 

Critical point 304 n. 6 

Cross-section: of a bundle 263; parallel — 271 

Curvature: associated with linear connec- 


tion 275, 277; associated with 
Riemannian metric 144f,; con- 
stant 145,238,333n.15;Gaussian 144, 


313 n. 4; (analogy of Gaussian curvature 
with gravity 146f.); of planecurve 313 
n. 3; sectional 145 

Curvature form of a connection 271 

Curvature scalar 210, 280, 281 (D. 5), 318 
n, 27 

Curve 6, 94; causal 124, 127; chronolo- 
gical 124; closed 124; closed 
causal 253; energy-critical 94; horis- 
motical 124; length-critical 95; repara- 
metrization of 259; tangent to 259 


Diffeomorphism 258 

Differentiability, order of—physical signifi- 
cance 348n. 4 

Differential 260, 261 

Differential form 99, 266~8 

Direct sum of vector spaces 297n. 5 

Distance 303n.1; in proper Riemannian 
manifold 95 

Determinism: —chronogeometrical 
logical 249 

Domain of dependence 

Doppler effect 70, 300 

Drag-along of covariant tensor field by 
diffeomorphism 165 


249-51; 


124 
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Ehrenfest paradox 314 n. 22, 315 n. 26 

Einstein field equations of the gravitational 
field 159, 160, 173, 281 (D. 6, D. 7); 
singled out among alternatives 160, 318 
n. 26, 323 n. 34, 332 n. 44; weak-field or 
linearized approximation 325 n. 35 

Einstein simultaneous 54 

Einstein spherical static universe 
329 n. 24 

Einstein summation convention 89 

Einstein tensor: proper 160, 323 n. 36; modi- 
fied 160, 318 n. 28; vanishing of covari- 
ant divergence 160, 318 n. 26 

Einstein time 54 

Electromagnetic conception of matter 
n. 18, 302 n. 5, 322 n. 19 

Electromagnetic energy tensor 

Electromagnetic field tensor 115, 307 n. 3 

Electromagnetic stress tensor 117 

Electromagnetic unit of charge 289 n. 4 

Electrostatic unit of charge 289 n. 4 

Embedding of a manifold into another 
n. 7; isometric 326 n. 7 

Energy, kinetic (Special Relativity) 111, 113 

Energy of moving massive particle in Special 
Relativity 70f. 

“Energy” of vector or curve in Riemannian 
manifold 94, 304 n. 5 

Energy tensor 119, 308 n. 11, n. 14; Con- 
servation law 120; matching law in 
General Relativity 158; energy tensor 
nota satisfactory representation of matter 
324 n. 11; et. of dust 172; of perfect 
fluid 120, 331 n. 40; plays same role as 
gravitational energy matrix in Einstein 
field eqns 174, 175; vanishing of covari- 
ant divergence implied by Einstein field 
eqns 160, 174, 175; vanishing of covari- 
ant divergence not a “pure conservation 
principle” 158 

EP (Equivalence Principle) 135. See also 
under Principle of Equivalence. 

Erlangen Programme 2, 26, 34, 283 n. 1, n. 3 
340 n. 8 

Event 22 

Expansion of the universe 203, 330 n. 27; due 
to spacetime geometry 207 

Exponential mapping 251, 278 

Exterior derivative 101, 267f. 


198, 204, 


290 


117, 172 


326 


Fermi chart 200, 319 n. 6 
Fermi derivative 330 n. 34: 
Fermi transport 201, 331 
Fernparallelismus 191, 327 n. 16 
Fibre 6, 288 n. 8 
Fibre bundle: principal 


263; associated 264 


390 INDEX 

Field equations and equations of motion: in 
classical field theories 176; in General 
Relativity 176 ff. 

Fitzgerald-Lorentz contraction 45-6, 85f., 
292 n. 32 

Flat (in affine space) 91 

Flat connection in principal fibre bundle 271 

Flat linear connection 277 


Flat Riemannian manifold or metric 145, 280 
Foliation 345 n. 47 
Force, differential (Reichenbach) 236 


—, impressed 8, 284 n. 5 
—, inertial 20, 287 n. 17 
—,innate 8, 284 n. 5 (see also Inertia, Mass) 
—, Newtonian 8, 18f., 107; represented by 


“same” vector relative to all inertial 
frames 19, 107 
Force, relativistic: Einstein's definition of 


1905 109f., 112, 306 n. 8, 307 n. 16; 
Planck’s definition of 1906 112 

—, universal (Reichenbach) 236-8, 342 n. 25, 
n. 28 

See also Four-force, Lorentz force, Relative 

force 

Foucault’s pendulum 194, 196 

Four-acceleration 105f., 306 n. 18 

Four-current density 114 

Four-force 114 

Four-force density 115 

Four-momentum 114 

Four-potential, electromagnetic 307 n. 3 

Four-tensor field 102; meaningless relatively 
to arbitrary frames 106, 154; ordinary 
tensor field uniquely defined by it 106f. 

Four-vector 86, 102; decomposition into 
spacelike and timelike parts 104, 305 
n. 14 

Four-velocity 105, 305 n. 16 

Frame 28; inertial 17-20, 28, 51, 74, 224, 
293 n. 2; rigid 14, 28; rotating 147-9, 
314 n, 22, 315 n. 25, n. 26; “stationary” 
(Einstein, 1905d) 51, 293 n. 2; uniformly 
accelerated 310 n.6 

Freedom in scientific theorizing 56, 231, 340 
n. | 

Friedmann universes 204-10, 212-13, 219, 
333-4; dismissed by Einstein (1922) 204, 
333 n.9; embraced by Einstein (1931) 
204f.; periodic 335 n. 9, 337 n. 26 

Friedmann’s equation 334 n. 31 

Friedmann’s hypotheses 205 

Future directed causal curve 124; vector 92 

Future distinguishing spacetime 217 

Future set 217 


Galilei chart 23; connection with inertial 
frames 29 


Galilei group 26, 65, 288 n. 12 

Galilei invariants of Newtonian spacetime 27 

Gauss’s Theorema egregium 144 

General covariance 152-8, 316 n. 2, n. 7, 319 
n. 4, 321 n. 2; judged impossible in gravi- 
tational theory by Einstein (1913/14) 
163-5; vindicated by Einstein 
(1915) 165-7 

Geodesic: affine 187; complete 211 f., 278; 
of linear connection 189, 277; 
Riemannian 94, 304 n.7 

Geodesic deviation 343 n. 38 

Geodesic equations 151 (5.4.7*), 277 (C.7.2), 
281 (D.8) 

Geodesic incompleteness 212; insufficient to 
characterize singular spacetimes 215f.; 
null-geodesic incomplete Lorentz metric 
conformal to a complete metric 336 
n. 10 

Geodesic Law of Motion 151, 315 n. 35, 316 
n. 36; implied under certain conditions by 
Einstein field equations 177 ff. 

Geometrical object 98, 104, 316 n. 1 

Geometry 1, 2 

—, elliptical 329 n. 24 

—, Euclidean 89, 285 n. 6, 286 n. 2 to Sect. 1.4; 
privileged according to Dingler 343 
n. 34 

-~, Minkowski 89 ff., 98, 121 ff; characterized 
by causal structure 125-8 

—, Non-Euclidean: Cayley-Klein theory 340, 
n. 8; of Minkowski spacetime 88, 89; of 
rotating frame 148 f., 315 n. 26 

— of Free Fall and Light Propagation, 4, 
192-4, 327 n. 22 

—, “Riemann” (spherical) 329 n. 23 

—, Riemannian see Riemannian manifold, 
Riemannian metric. 

—, spacetime 26ff., 91-3, 95-8, 194 

Geometry and causality 123, 255 

Geometry and Gravitation 138, 139, 140, 
142, 146f., 150-2, 194-6, 240, 241, 245 

Gravitation, theories of: 

Abraham 139-40, 312 
Einstein: main hypotheses 
324 n. 11; 1913 outline 143, 158 ff; 1915 
first theory 170-72; 1915 second 
theory 172f. 1915 final theory 173-5, 
237f., 322 n. 30 
Einstein—Grossmann 
Hilbert 175f. 
Lorentz 130f., 310n. 7 
Mie 142f., 313 n. 35, 322 n. 19 
Minkowski 310 n. 14 
Newton 31-4, 130, 191, 237 
Nordstrom 140-2, 312-13 
Poincaré 132-3, 310 n. 8 


139, 159, 240, 


158-62, 169, 196 


Gravitation, theories of: (contd, 
Reference to “Machian” theories 
Gravitational field equations: 
Einstein (discarded 1915 versions) 171 
(5.7.4), 173 (5.7.15) 
— (final 1915 version) 
(5.5.10) 
— (1917 version) 281 (D. 6); cf. 160 (5.5.11) 
Einstein—Grossmann 161 (5.5.12), 319 
n. 30 
Poisson 33 (1.7.4) 
Gravitational energy and momentum 183f., 
319 n. 6 
Gravitational energy matrix 158, 161, 171; 
called “energy tensor” (with sneer quotes) 
by Einstein 174 
Gravitational field strength components in 
Einstein’s theories of gravity 151, 158, 
189, 237, 317 n. 20 
Gravitational potential 
light 138, 140; 
metric 139, 151 
Gravitational radiation 181, 319 n. 6, 325 
n. 29, n. 34; and equations of motion 181 
ff.; propagation speed 181, 310n.6,n. 12 
Gravitational redshift 135, 138, 311 n.12,n.6 
Gravity: as a universal force (Reichenbach) 
236-8, 342 n. 28, n. 31; linked to proper 
time 138. See also Geometry and 
gravitation. 
Groups, semidirect product of two 26. See 
also Action of a group, Galilei g., Lie g., 
Lorentz g., Poincaré g., Velocity g. 


331 n. 42 


173 (5.7.17), of. 160 


7, 33; and speed of 
and _—_ space-time 


Guidance field (Fiihrungsfeld) 194 

Hausdorff topology 124f. 

Hole argument against general covari- 
ance 163-8; criticized by Hilbert 320 
n. 33 

Homeomorphism 348 n. 2 

“Homogeneity of spaceand time” 57,72, 74f. 


Homogeneous space 72, 296 n. 2 to Sect. 3.6 

Homotopic curves 346 n. 11 

Horismos, future 121; past 121 

Horismos relation 121 

Horizon: event 254f.; particle 254f.; 
Penrose particle 347 n. 20 

Hypersurface 261 


Ideal points of relativistic spacetime 183, 
216-18, 324 n. 10 

Inertia 8, 107, 135; compass of 200, 330 
n. 34; i. of charged particle moving in 
electromagnetic field 110; i. relative to 
masses, not “space” 197. See also under 
Mass. 


INDEX 391 
Inertial see under Force, Frame, Mass, 
Motion, Time-scale 
Inner product 303 n. 1; Minkowski 
(on R*) 91 


Integral curve of a vector field 260 
Interval, Minkowski: in R* 89;in.@ 90,92 
Invariant 26 


Jacobi’s equation 343 n. 38 


Kernel of a linear mapping 349 n. 2 


Langevin clock 52 

Law of Currents (Ampére) 35, 36 

Law of Gravity (Newton) 32ff. 

Law of Motion, Geodesic see Geodesic Law 


Law of motion in General Relativity 176 ff. 
Laws of Coulomb 35 
Laws of Motion (Newton): First — see 


Principle of Inertia; Second 9, 19, 29f., 
55, 107, 284 n. 7; Third 9, 19, 43, 51, 55, 
83, 107f., 287 n. 15 

Laws of physics and Relativity Principle 30f., 
54f., 65 

“Lemaitre universes” 333 n. 10 

Length contraction, relativistic 67f., 97f. See 
also Fitzgerald-Lorentz contraction 

Length of curve 94 

Length of moving rod 296 n. 1 to Sect. 3.5 

Levi-Civita connection 101, 188, 279; com- 
ponents 281 (D. 1) 

licf (local inertial comoving frame) 115 

Lie algebra 268 

Lie bracket 268 

Lie derivative 101, 261-2, 268 

Lie group 268, 269; defined 288 n. 12; ad- 
joint representation 269 

Lift 270, 349 n.7 

Light 37, 40-42, 49, 50, 51, 84, 137 f., 290 
n. 12, 312 n. 13; aberration of 41, 70, 
291 n.10, n.13, 300; deflection by 
gravity 135, 138, 173, 311 n. 7, 322 
n. 23; propagation of 312 n. 13, n. 14 

Light speed and gravitational potential 138, 
140 

Light speed as maximum signal speed 71 

Light velocity: in moving _ refracting 
medium 70 (see also Aether drag); 
measured by Roemer 339 n. 25 

Linear connection 272; complete 278; 
incomplete 212;metric 279;symmetric 
or torsion-free 188, 277; Weyl’s defi- 
nition 188. See also  Levi-Civita 
connection. 


392 INDEX 
Line element: of arbitrary Riemannian 
manifold 101; of surface 313 n. 4; 


Robertson-Walker 207; Schwarz- 
schild 335 n. 2; de Sitter empty uni- 
verse 329 n. 27; static Einstein universe 
329 n. 24 

Line of force (Faraday) 36 

Locally Minkowskian 145, 314 n. 13 

Lorentz chart 54, 56, 227, 296 n. 9; local 136, 
147; matched Lorentz charts 57,295 n.3 

Lorentz force 43, 113, 116, 323 n. 2 

Lorentz group 6, 62, 89, 283 n. 5, 295 n. 12, 
n. 15, 302 n. 29; general homo- 
geneous 280; homogeneous 6, 63-5, 
86, 295 n.13;  orthochronous 64; 
proper 64 

Lorentz metric 93,314 n. 14; compatible with 
linear connection 192; topology for set 
of Lorentz metrics on a given 4-mani- 
fold 337 n. 17 

Lorentz tetrad field 102 

Lorentz transformation: equations for match- 
ing charts (“coordinate systems in stan- 
dard position”) 61, 84; equations for 
non-matching charts with common 
origin 62; linearity of 57, 71-6, 297 
n.9, n. 10, 298 n. 3, 299 n. 9; of elec- 
tromagnetic field components relative to 
matching charts 109 

LP (Light Principle) 54. See also under 
Principle of Constancy of Light Velocity 


Mach’s Principle 196-202, 328 n. 7,n. 11, 331 
n. 43, n. 44; dismissed by Einstein 201, 
202; inimical to General  Rela- 
tivity 199-201; opposed to Equivalence 
Principle 200 

Machian effects in Einstein’s theories of 
gravity 196f., 200, 328 n. 11 

Machian solutions to Einstein field equa- 
tions 331 n. 43, 332 n. 45 

Manifold 2, 72f, 257ff, 348 n. 3; ana- 
lytic 288 n. 12; complex 348 n. 3; 
flat 145; locally Minkowskian 145; or- 
ientable 257; parallelizable 93, 99, 265; 
product 258; real n-dimensional C*- 
differentiable 257. See also under Affine, 
Conformal, Projective, Riemannian 

Manifold-with-boundary 325 n. 32 

Mass 9, 107; inertial vs. gravitational 32, 
133, 134, 135, 310 n. 2, n. 5, n. 10; 
longitudinal 13; proper 110; relativis- 
tic 112; transverse 113, 307 n. 16 

Mass-energy equivalence 111f. 306 n. 11, 
n. 13, 308 n. 15 

Maximum signal speed 71, 82, 338 n. 10 


Maxwellequations 36,39, 43, 50, 55, 109, 116, 
289 n. 10, 297 n. 9, 291 n. 8, 293 n. 8, 306 
n. 2, 307 n. 5, 323 n. 2, 327 n. 11 

Mercury’s perihelion precession 131, 169, 
173, 322 n. 24 

Metric on a set (Griinbaum) 242 

Metric space 303 n. 1; completion 211 

Metrical amorphousness and continuity 242 

Microwave background radiation 204, 255 

Minkowski metric 95; local approximation 
to global spacetime metric 150-2 

Minkowski spacetime 90, 92, 95, 186 

Minkowski’s world 89, 303 n. 10 

mirf (momentary inertial rest frame) 96 

Momentum: Newtonian 107; relativistic 112 

Motion 8; free vs. forced 30, 96, 158, 194 


n-ad 6, 99; orthonormal 279, 303 n. 1 

n-ad field 6, 99; holonomic 262 

n-ads, principal bundle of 264 f., canonic 
form 272 

Newton’s programme for natural philo- 
sophy 8, 35 

Newton’s wisdom, as judged by Einstein 328 
n. 8 

Normal chart 278 

“Now” (to nun) 220, 248, 249 

Null cone: in R* 92, 303 n. 2; in Minkowski 
spacetime 92 

Null curve 94 

Null vector 92, 94 


Observationism: embraced by Einstein 
(1916) 328 n. 4; dismissed by Einstein 
(1933) 231, (1937) 331 n. 39 

Orbit of an object under group action 295 
n. 14 

Orthogonal (in Minkowski geometry) 91 f. 

Orthonormal basis of inner product 
space 303 n. | 


Parallel cross-section 271 

Parallel transport 101, 187 f., 270 

Parallelism: in affine space 91; in principal 
fibre bundle 270; of vectors in affine 
manifold 187; path-dependent 187 f., 
266, 270, 326 n. 4 

Parallelizable manifold 93, 99, 265 

Parameter 259 

Parametric line of a coordinate function 259 

Parity 288 n. 9, n. 12 

Past directed causal curve 124 

Past distinguishing spacetime 217 

Past set 217 


Path 6, 94, 259 
Path topology 129 
Permutation 24 
pk (purely kinematic) coordinate transform- 
ation: Galilei 29;Lorentz 61 f.,65, 295 
n. 15 
Physical objects as geometrical objects 104 
Poincaré group 6, 62 
Point transformation: Galilei 
62 
Poisson’s equation 33, 34, 133, 159 
“Possible event” 22 
Power as inner product of velocity and relative 
force 113 
Poynting vector 
Principles: 
of Chronology 82, 300 n. 13 
of Coexistence according to the Law of 
Interaction (Kant) 220 
of Conservation of 4-momentum 114 
of Constancy of Light Velocity (Light 
Principle) 49, 54, 55, 67, 84, 138, 293 
n. 12, 294 n, 2 to Sect. 3.3, 295 n. 7, 311 
n.4 
of Equivalence 19; 32, 134-8, 151, 195, 
287n. 14, 310n. 5, 311 n. 1, n. 3, 315n. 35, 
316 n. 36; allegedly generalizes Relativity 
Principle 134; but actually modifies 
it 136 f.; extended to arbitrary gravi- 
tational fields 151; “semi-strong” 311 
n. 13 
of Inertia 12, 16, 17, 29, 51, 55 
of Reciprocity 79, 298 n. 4, n. 5 
of Relativity (General) 153, 195 
of Relativity (Newtonian) 19, 30-31, 39 
of Relativity (Special) 49, 54, 65 f., 76, 87, 
294 n. 2 to Sect. 3.3, 295 n. 18, 298 n. 5; 
compared with General Relativity 
Principle 153 f.; modified by Equiv- 
alence Principle 136 f.; variously for- 
mulated by Poincaré 83, 85, 301 n. 6, 
n.9 
of Spatial Isotropy 79 
of Temporal Succession according to the 
Law of Causality (Kant) 221 
of Thermodynamics 49, 83 
See also Cosmological principle, Mach’s 
principle 
Projective structure of a manifold 191 
Proper time 96 
Pseudo-group 316n. 6 
Pull-back 316 n. 12 


25; Lorentz 


116, 319 n. 6 


Quadratic form 303 n. 1} 


Radar chart 194 


INDEX 393 
Radar signal 53 
Reciprocity 79 
Reichenbach “parameter” 
function 225-6 
Reichenbach’s Rule for the definition of simul- 
taneity 223, 224 ff., 230, 338 n. 11, n. 14, 
339 n. 18 
Relative force 112 
Relativity: 
General 2, 152-8, 194, 197, 200 f., 240, 
241; dynamic formulation 181 
Newtonian vs. Einsteinian 38, 
Special 2, 48-82, 85, 107-21; incom- 


223-5; truly a 


patible with Newtonian gravity 131; lo- 
cally valid only 136 f.; transmutation of 
its laws into laws of General 
Relativity 156 f., 316n. 11 

“Relativity Theory of Poincaré and 
Lorentz” 83-7, 300-2, 350 

Repére 6,99; — mobile 6,99 

Ricci tensor 7, 159, 170, 173, 175, 189, 280, 
318 n. 29; components 281 (D. 4); 


Einstein’s notation of 1915 321 n. 11 

Riemann tensor 145, 189, 279, 280, 313 n. 8, 
318 n. 24, n. 26; components 281 (D. 2, 
D. 3) 

Riemannian geodesic 94 

Riemannian manifold 93, 283 n. 1; line el- 
ement 101; of constant curvature 333 
n. 15; proper 94 

Riemannian metric 6, 93, 100; compatible 
with linear connection 192; compon- 
ents 94, 279; conformal 192; distin- 
guished from other tensor fields 344 
n. 41; indefinite 93; negative defi- 
nite 93; positive definite 93; 
proper 6; signature 93 

Rigid body in Special Relativity 71; in 
General Relativity 239 

Robertson-Walker line element 207, 230 

Rods as standards of length in General 
Relativity 239, 315 n. 25, n. 26 

Rod, infinitesimal 239 f. 

Rotating disk 147-9, 314 n. 22, 315 n. 25, 
n. 26 

Rotating finite homogeneous universe 331 
n. 36 

Rotation in General Relativity 200 f. 

RP (Einstein's Relativity Principle of 
1905) 54. See also under Principle of 
Relativity (Special) 


Scalar field 99, 258 

Schmidt completion of relativistic space- 
time 217, 324 n. 10, 337 n. 22 

Schwarzschild mass 325 n. 33 


394 INDEX 

Schwarzwehild solution of Einstein field equa- 
tions 322 n. 22, 335 n. 2 

Self-measured speed (Bridgman) 298 n. 4 

Self-parallel curve 189 

Semi-Riemannian metric 6 

Separate worldpoints 121 

Signature: of inner product 
Riemannian metric 93 

Simultaneity 83 f., 220-30, 294 n. 5, 338- 
40; conventionality of 51, 54, 82, 84, 
220—30, 293 n. 1; James Thomson’s view 
of 294n.5;Malament’stheorem 229- 
30; ‘metrical’ 223, 338 on. 5; 
Reichenbach’s Rule for definition 
of 223, 230; relativity of 66 f., 98; 
“topological” 222, 338 n. 8; ‘“‘true” sim- 
ultaneity allegedly established by signals 
of known speed 339 n. 25 

Simultaneity class 66 

Simultaneous 27, 54, 66, 220, 223, 229, 339 
n. 23 

Singularity: in classical field theory 210; of 
complex-valued function 334 n. 1; of 
relativistic spacetime 178, 210-19, 324 
n. 10, 335 n. 6, 336 n. 10, 337 n. 20; 
crushing 219; particles as ‘‘moving sin- 
gularities” 178 f. . 

Singularity theorems 213-15, 336 n. 13-16 

de Sitter’s empty universe 199, 203 f., 328 
n, 28, n. 29, 329 n. 27 

Six-vector 102, 103 

Space 1, 2; elliptical 329 n. 24; New- 
tonian 10 f., 285 n. 10; Newton’s struc- 
turalist view of 285 n. 7; relative space of 
arigid frame 14; Riemann’s view 313 
n. 1, 344 on. 44; spherical 
(‘Riemann’) 329 n. 22, n. 24. See also 
Affine space, Homogeneous space 

Spacelike: curve 94; part of 4-vector 104; 
vector 92, 94; worldline (in Newtonian 
spacetime) 27 

Spacetime 21-3, 74, 146, 167 f. 

Newtonian 21, 23-31, 191, 288 n. 16 


303 n. 1; of 


Minkowski 20, 88-93, 95-8, 121-9, 145, 
194; alternative definitions 90, 92, 95 
Relativistic (general) 146, 178, 251 f.; 


causal structure 192, 247 f.; causality 
conditions 337 n. 23; differentiable 
structure constructed from particle be- 
haviour 193 f.; embeddability in R° 
(lower bounds for n) 326 n. 7; metric 
determined by conformal and projective 
structure 192 f.; observationally indis- 
tinguishable 347 n. 13; past/future dis- 
tinguishing 217, 337 n. 23; singular 
210-19, 335 n. 6, 336 n. 10, 337 n. 20; 
singular spacetime condemned by 


Einstein 178; stably causal 337 n. 23; 
time-orientable 251, 346 n. 10; universal 
covering space exists always 252 
Straight: in affine space 91; in Minkowski 
spacetime 309 n. 10 
Stress-energy tensor see Energy tensor 
Stress tensor, mechanical 118 
Structure constants of a Lie group 269 
Submanifold 261; open 258 
Synchronism see under Clock synchroniz- 
ation, Simultaneity 


Tachyon 71, 296 n. 10 
Tangent bundle 99, 265 
Tangent space 260, 262 


Tangent vector 259 
Tensor: at a point 99; comparison of tensors 
at different points 101; contravari- 


ant 100; covariant 100; mixed 100, 
304 n. 2; skew-symmetric 99; sym- 
metric 99 

Tensor bundle 99 

Tensor density 317n. 18 


Tensor field 99, 304 n.3, 305 n.7; 
Cartesian 103; components relative to a 
chart 100; components relative to n-ad 
field 100; contraction of 394 n. 15; 
time-dependent Cartesian 103 

Tensor product 349 n. 15 

Tetrad 6; 

Tetrad field 6 

Thermodynamics as phenomenological 
theory 49 

Thomas precession 295 n. 15 

TIF (terminal indecomposable future 
set) 218 


TIP (terminal indecomposable past set) 218 

Time 11 ff., 220 ff., 247 ff; allegedly 
spatialized in Minkowski geometry 98; 
as fourth dimension 22; Einstein time 
for an inertial frame 53 f.; local 
(Lorentz) 44, 292 n. 27; local 
(Poincaré) 300 n. 5; Newtonian 12, 28, 
50; proper 96, 138; universal 12, 13, 
69, 205, 228, 254, 293 n. 1 

Time coordinate function 15, 53; a global 
one only possible in a stably causal space- 
time 205, 228, 254 

Time dilation, relativistic 68 f., 98, 296 

“Time from the Creation of the World” 
(Friedmann) 209-10, 212-13, 334n. 35 

Time reversal 288 n. 12 

Time scale, inertial 17, 53 f., 225 

Timelike: curve 94; part of 4-vector 105; 
vector 92, 94; worldline (in Newtonian 
spacetime) 27 


Topology: induced in subspace 309 n. 15; 
induced by metric 303 n. 1. See also 
Alexandrov t., Compact, Connected 
Hausdorff t., Path t., Zeeman t. 

Torsion 275,276; components 276 (C.6.8) 

Torsion form 273 

Transformation see under Coordinate t., 
Lorentz t., Point t., Velocities, t. of 

Trapped surface 214 

Twin paradox 296 


Unified theories of gravitation and 
electricity 189-91; Einstein’s opinion of 
Weyl’s 1918 theory 327 n. 9, n. 10 


Variation 304 n. 6 

Variational principles in Einstein’s theories of 
gravity 169, 171, 175 

Vector: at a point 99, 260; components rela- 
tive to an m-ad = 272, 305 n. 6; horizon- 
tal 270; vertical 269 

Vector field 260, 265, 348 n. 7; complete 260; 
left-invariant 268; standard __ hori- 
zontal 273 


395 


Vector parallelism 187 f., 265, 266, 326 n. 4 

Velocities, transformation of: Einstein 
Rule 69f., 78, 82; Galilei Rule 29, 49, 
82, 104 

Velocity 105 

Velocity group of collinearly moving inertial 
frames 78 

Vorticity 331 n. 35 


INDEX 


Water-bucket experiment (Newton) 11, 285 
n. 11, 328 n. 9 

Weber’s electrodynamics 36 

Weyl tensor 282 

Winnie chart 227 

World- (prefix) 7, 157 

World-acceleration 157 

Worldforce 237 

Worldline 7, 23, 96 

Worldpoint 23, 89, 303 n. 10 

World-tensor 157 


Zeeman topology 128 
Zeeman’s Theorem 127, 227, 228 


